{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:34:35Z","timestamp":1772138075927,"version":"3.50.1"},"reference-count":29,"publisher":"Oxford University Press (OUP)","issue":"15","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":3413,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/2.0\/uk\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2007,8,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Motivation: Despite many years of research on how to properly align sequences in the presence of sequencing errors, alternative splicing and micro-exons, the correct alignment of mRNA sequences to genomic DNA is still a challenging task.<\/jats:p>\n                  <jats:p>Results: We present a novel approach based on large margin learning that combines accurate splice site predictions with common sequence alignment techniques. By solving a convex optimization problem, our algorithm\u2014called PALMA\u2014tunes the parameters of the model such that true alignments score higher than other alignments. We study the accuracy of alignments of mRNAs containing artificially generated micro-exons to genomic DNA. In a carefully designed experiment, we show that our algorithm accurately identifies the intron boundaries as well as boundaries of the optimal local alignment. It outperforms all other methods: for 5702 artificially shortened EST sequences from Caenorhabditis elegans and human, it correctly identifies the intron boundaries in all except two cases. The best other method is a recently proposed method called exalin which misaligns 37 of the sequences. Our method also demonstrates robustness to mutations, insertions and deletions, retaining accuracy even at high noise levels.<\/jats:p>\n                  <jats:p>Availability: Datasets for training, evaluation and testing, additional results and a stand-alone alignment tool implemented in C++ and python are available at http:\/\/www.fml.mpg.de\/raetsch\/projects\/palma<\/jats:p>\n                  <jats:p>Contact: \u00a0Gunnar.Raetsch@tuebingen.mpg.de<\/jats:p>\n                  <jats:p>Supplementary information: Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btm275","type":"journal-article","created":{"date-parts":[[2007,5,30]],"date-time":"2007-05-30T20:18:18Z","timestamp":1180556298000},"page":"1892-1900","source":"Crossref","is-referenced-by-count":11,"title":["PALMA: mRNA to genome alignments using large margin algorithms"],"prefix":"10.1093","volume":"23","author":[{"given":"Uta","family":"Schulze","sequence":"first","affiliation":[{"name":"1 Friedrich Miescher Laboratory, Max Planck Society, Spemannstr. 39, 72076 T\u00fcbingen, Germany, 2University of Leipzig, Johannisgasse 26, 04103 Leipzig, Germany, 3Fraunhofer FIRST, Kekul\u00e8str. 7, 12489 Berlin, Germany, and 4Max Planck Institute for Biological Cybernetics, Spemannstr. 38, 72076 T\u00fcbingen, Germany"},{"name":"1 Friedrich Miescher Laboratory, Max Planck Society, Spemannstr. 39, 72076 T\u00fcbingen, Germany, 2University of Leipzig, Johannisgasse 26, 04103 Leipzig, Germany, 3Fraunhofer FIRST, Kekul\u00e8str. 7, 12489 Berlin, Germany, and 4Max Planck Institute for Biological Cybernetics, Spemannstr. 38, 72076 T\u00fcbingen, Germany"}]},{"given":"Bettina","family":"Hepp","sequence":"additional","affiliation":[{"name":"1 Friedrich Miescher Laboratory, Max Planck Society, Spemannstr. 39, 72076 T\u00fcbingen, Germany, 2University of Leipzig, Johannisgasse 26, 04103 Leipzig, Germany, 3Fraunhofer FIRST, Kekul\u00e8str. 7, 12489 Berlin, Germany, and 4Max Planck Institute for Biological Cybernetics, Spemannstr. 38, 72076 T\u00fcbingen, Germany"}]},{"given":"Cheng Soon","family":"Ong","sequence":"additional","affiliation":[{"name":"1 Friedrich Miescher Laboratory, Max Planck Society, Spemannstr. 39, 72076 T\u00fcbingen, Germany, 2University of Leipzig, Johannisgasse 26, 04103 Leipzig, Germany, 3Fraunhofer FIRST, Kekul\u00e8str. 7, 12489 Berlin, Germany, and 4Max Planck Institute for Biological Cybernetics, Spemannstr. 38, 72076 T\u00fcbingen, Germany"},{"name":"1 Friedrich Miescher Laboratory, Max Planck Society, Spemannstr. 39, 72076 T\u00fcbingen, Germany, 2University of Leipzig, Johannisgasse 26, 04103 Leipzig, Germany, 3Fraunhofer FIRST, Kekul\u00e8str. 7, 12489 Berlin, Germany, and 4Max Planck Institute for Biological Cybernetics, Spemannstr. 38, 72076 T\u00fcbingen, Germany"}]},{"given":"Gunnar","family":"R\u00e4tsch","sequence":"additional","affiliation":[{"name":"1 Friedrich Miescher Laboratory, Max Planck Society, Spemannstr. 39, 72076 T\u00fcbingen, Germany, 2University of Leipzig, Johannisgasse 26, 04103 Leipzig, Germany, 3Fraunhofer FIRST, Kekul\u00e8str. 7, 12489 Berlin, Germany, and 4Max Planck Institute for Biological Cybernetics, Spemannstr. 38, 72076 T\u00fcbingen, Germany"}]}],"member":"286","published-online":{"date-parts":[[2007,5,30]]},"reference":[{"key":"2023041105320589800_","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","article-title":"Basic local alignment search tool","volume":"215","author":"Altschul","year":"1990","journal-title":"J. Mol. Biol."},{"key":"2023041105320589800_","article-title":"Hidden markov support vector machines","volume-title":"Proceedings of 20th International Conference on Machine Learning","author":"Altun","year":"2003"},{"key":"2023041105320589800_","doi-asserted-by":"crossref","first-page":"723","DOI":"10.1016\/0022-2836(87)90354-8","article-title":"Selection of DNA binding sites by regulatory proteins. statistical-mechanical theory and application to operators and promoters","volume":"193","author":"Berg","year":"1987","journal-title":"J. Mol. Biol."},{"key":"2023041105320589800_","doi-asserted-by":"crossref","first-page":"332","DOI":"10.1038\/ng0893-332","article-title":"dbEST \u2013 Database for \u201cexpressed sequence tags\u201d","volume":"4","author":"Boguski","year":"1993","journal-title":"Nat. Genet."},{"key":"2023041105320589800_","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1007\/BF00994018","article-title":"Support vector networks","volume":"20","author":"Cortes","year":"1995","journal-title":"Mach. Learn."},{"key":"2023041105320589800_","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511790492","volume-title":"Biological sequence analysis: Probabilistic models of proteins and nucleic acids","author":"Durbin","year":"1998","edition":"7th edn"},{"key":"2023041105320589800_","doi-asserted-by":"crossref","first-page":"967","DOI":"10.1101\/gr.8.9.967","article-title":"A computer program for aligning a cDNA sequence with a genomic DNA sequence","volume":"8","author":"Florea","year":"1998","journal-title":"Genome Res."},{"key":"2023041105320589800_","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1006\/jcss.1997.1504","article-title":"A decision-theoretic generalization of on-line learning and an application to boosting","volume":"55","author":"Freund","year":"1997","journal-title":"J. Comput. Sys. Sci."},{"key":"2023041105320589800_","doi-asserted-by":"crossref","first-page":"9061","DOI":"10.1073\/pnas.93.17.9061","article-title":"Gene recognition via spliced sequence alignment","volume":"93","author":"Gelfand","year":"1996","journal-title":"Proc. Natl Acad. Sci."},{"key":"2023041105320589800_","doi-asserted-by":"crossref","first-page":"312","DOI":"10.1007\/BF01185430","article-title":"Parametric optimization of sequence alignment","volume":"12","author":"Gusfield","year":"1994","journal-title":"Algorithmica"},{"key":"2023041105320589800_","doi-asserted-by":"crossref","first-page":"D411","DOI":"10.1093\/nar\/gkh066","article-title":"Wormbase: A multi-species resource for nematode biology and genomics","volume":"32","author":"Harris","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023041105320589800_","doi-asserted-by":"crossref","first-page":"380","DOI":"10.1137\/1035089","article-title":"Semi-infinite programming: Theory, methods and applications","volume":"3","author":"Hettich","year":"1993","journal-title":"SIAM Rev."},{"key":"2023041105320589800_","first-page":"57","article-title":"Learning to align sequences: a maximum-margin approach","volume-title":"New Algorithms for Macromolecular Simulation","author":"Joachims","year":"2005"},{"key":"2023041105320589800_","first-page":"441","article-title":"Simple and fast inverse alignment","volume-title":"RECOMB","author":"Kececioglu","year":"2006"},{"key":"2023041105320589800_","first-page":"656","article-title":"BLAT\u2013the BLAST-like alignment tool","volume":"12","author":"Kent","year":"2002","journal-title":"Genome Res"},{"key":"2023041105320589800_","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1007\/3-540-36434-X_4","article-title":"An introduction to boosting and leveraging","volume-title":"Advanced Lectures on Machine Learning","author":"Meir","year":"2003"},{"key":"2023041105320589800_","doi-asserted-by":"crossref","first-page":"181","DOI":"10.1109\/72.914517","article-title":"An introduction to kernel-based learning algorithms","volume":"12","author":"M\u00fcller","year":"2001","journal-title":"IEEE Trans. Neural Netw."},{"key":"2023041105320589800_","doi-asserted-by":"crossref","first-page":"i369","DOI":"10.1093\/bioinformatics\/bti1053","article-title":"RASE: Recognition of alternatively spliced exons in C.elegans","volume":"21","author":"R\u00e4tsch","year":"2005","journal-title":"Bioinformatics"},{"key":"2023041105320589800_","doi-asserted-by":"crossref","first-page":"S9","DOI":"10.1186\/1471-2105-7-S1-S9","article-title":"Learning interpretable svms for biological sequence classification","volume":"7","author":"R\u00e4tsch","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023041105320589800_","first-page":"104","article-title":"PALMA: Perfect alignments using large margin algorithms","volume-title":"German Conference on Bioinformatics","author":"R\u00e4tsch","year":"2006"},{"key":"2023041105320589800_","doi-asserted-by":"crossref","unstructured":"R\u00e4tsch G \u00a0et al. Improving the C. elegans genome annotation using machine learning PLoS Comput. Biol. 2007 3 e20 10.1371\/journal.pcbi.0030020.eor","DOI":"10.1371\/journal.pcbi.0030020"},{"key":"2023041105320589800_","doi-asserted-by":"crossref","DOI":"10.1007\/3-540-46084-5_54","article-title":"New methods for splice-site recognition","volume-title":"Procedings of. International Conference on Artificial Neural Networks","author":"Sonnenburg","year":"2002"},{"key":"2023041105320589800_","article-title":"Accurate splice site recognition using SVMs","volume-title":"BMC Bioinformatics","author":"Sonnenburg","year":"2007"},{"key":"2023041105320589800_","doi-asserted-by":"crossref","first-page":"241","DOI":"10.1146\/annurev.bb.17.060188.001325","article-title":"Computer methods for analyzing sequence recognition of nucleic acids","volume":"17","author":"Stormo","year":"1988","journal-title":"Annu. Rev. Biophys. Biophys. Chem."},{"key":"2023041105320589800_","doi-asserted-by":"crossref","first-page":"203","DOI":"10.1093\/bioinformatics\/16.3.203","article-title":"Optimal spliced alignment of homologous cDNA to a genomic DNA template","volume":"16","author":"Usuka","year":"2000","journal-title":"Bioinformatics"},{"key":"2023041105320589800_","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4757-2440-0","volume-title":"The Nature of Statistical Learning Theory","author":"Vapnik","year":"1995"},{"key":"2023041105320589800_","doi-asserted-by":"crossref","first-page":"1216","DOI":"10.1101\/gr.677503","article-title":"Computational discovery of internal micro-exons","volume":"13","author":"Volfovsky","year":"2003","journal-title":"Genome Res."},{"key":"2023041105320589800_","doi-asserted-by":"crossref","first-page":"1952","DOI":"10.1101\/gr.195301","article-title":"Spidey: a tool for mRNA-to-genomic alignments","volume":"11","author":"Wheelan","year":"2001","journal-title":"Genome Res."},{"key":"2023041105320589800_","doi-asserted-by":"crossref","first-page":"13","DOI":"10.1093\/bioinformatics\/bti748","article-title":"Improved spliced alignment from an information theoretic approach","volume":"22","author":"Zhang","year":"2006","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/15\/1892\/49815288\/bioinformatics_23_15_1892.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/15\/1892\/49815288\/bioinformatics_23_15_1892.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,14]],"date-time":"2024-02-14T12:16:53Z","timestamp":1707913013000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/23\/15\/1892\/203970"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,5,30]]},"references-count":29,"journal-issue":{"issue":"15","published-print":{"date-parts":[[2007,8,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btm275","relation":{"has-review":[{"id-type":"doi","id":"10.3410\/f.1160397.620743","asserted-by":"object"}]},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2007,8]]},"published":{"date-parts":[[2007,5,30]]}}}