{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,28]],"date-time":"2025-10-28T14:51:27Z","timestamp":1761663087118},"reference-count":36,"publisher":"Oxford University Press (OUP)","issue":"13","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2012,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Cytosine DNA methylation is one of the major epigenetic modifications and influences gene expression, developmental processes, X-chromosome inactivation, and genomic imprinting. Aberrant methylation is furthermore known to be associated with several diseases including cancer. The gold standard to determine DNA methylation on genome-wide scales is \u2018bisulfite sequencing\u2019: DNA fragments are treated with sodium bisulfite resulting in the conversion of unmethylated cytosines into uracils, whereas methylated cytosines remain unchanged. The resulting sequencing reads thus exhibit asymmetric bisulfite-related mismatches and suffer from an effective reduction of the alphabet size in the unmethylated regions, rendering the mapping of bisulfite sequencing reads computationally much more demanding. As a consequence, currently available read mapping software often fails to achieve high sensitivity and in many cases requires unrealistic computational resources to cope with large real-life datasets.<\/jats:p>\n               <jats:p>Results: In this study, we present a seed-based approach based on enhanced suffix arrays in conjunction with Myers bit-vector algorithm to efficiently extend seeds to optimal semi-global alignments while allowing for bisulfite-related substitutions. 
It outperforms most current approaches in terms of sensitivity and performs time-competitive in mapping hundreds of millions of sequencing reads to vertebrate genomes.<\/jats:p>\n               <jats:p>Availability: The software segemehl is freely available at http:\/\/www.bioinf.uni-leipzig.de\/Software\/segemehl.<\/jats:p>\n               <jats:p>Contact: E-mail: steve@bioinf.uni-leipzig.de<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/bts254","type":"journal-article","created":{"date-parts":[[2012,5,12]],"date-time":"2012-05-12T01:02:48Z","timestamp":1336784568000},"page":"1698-1704","source":"Crossref","is-referenced-by-count":46,"title":["Fast and sensitive mapping of bisulfite-treated sequencing data"],"prefix":"10.1093","volume":"28","author":[{"given":"Christian","family":"Otto","sequence":"first","affiliation":[{"name":"1 Interdisciplinary Center for Bioinformatics and Bioinformatics Group, Department of Computer Science, University Leipzig, 04107 Leipzig, Germany, 2Transcriptome Bioinformatics Group, LIFE \u2014 Leipzig Research Center for Civilization Diseases, University Leipzig, 04107 Leipzig, Germany, 3RNomics Group, Fraunhofer Institute for Cell Therapy and Immunology, 04103 Leipzig, Germany, 4Santa Fe Institute, Santa Fe, NM 87501 USA, 5Department of Theoretical Chemistry, University of Vienna, A-1090 Vienna, Austria and 6Max-Planck-Institute for Mathematics in Sciences, 04103 Leipzig, Germany"},{"name":"1 Interdisciplinary Center for Bioinformatics and Bioinformatics Group, Department of Computer Science, University Leipzig, 04107 Leipzig, Germany, 2Transcriptome Bioinformatics Group, LIFE \u2014 Leipzig Research Center for Civilization Diseases, University Leipzig, 04107 Leipzig, Germany, 3RNomics Group, Fraunhofer Institute for Cell Therapy and Immunology, 04103 Leipzig, Germany, 4Santa Fe Institute, Santa Fe, NM 87501 USA, 
5Department of Theoretical Chemistry, University of Vienna, A-1090 Vienna, Austria and 6Max-Planck-Institute for Mathematics in Sciences, 04103 Leipzig, Germany"}]},{"given":"Peter F.","family":"Stadler","sequence":"additional","affiliation":[{"name":"1 Interdisciplinary Center for Bioinformatics and Bioinformatics Group, Department of Computer Science, University Leipzig, 04107 Leipzig, Germany, 2Transcriptome Bioinformatics Group, LIFE \u2014 Leipzig Research Center for Civilization Diseases, University Leipzig, 04107 Leipzig, Germany, 3RNomics Group, Fraunhofer Institute for Cell Therapy and Immunology, 04103 Leipzig, Germany, 4Santa Fe Institute, Santa Fe, NM 87501 USA, 5Department of Theoretical Chemistry, University of Vienna, A-1090 Vienna, Austria and 6Max-Planck-Institute for Mathematics in Sciences, 04103 Leipzig, Germany"},{"name":"1 Interdisciplinary Center for Bioinformatics and Bioinformatics Group, Department of Computer Science, University Leipzig, 04107 Leipzig, Germany, 2Transcriptome Bioinformatics Group, LIFE \u2014 Leipzig Research Center for Civilization Diseases, University Leipzig, 04107 Leipzig, Germany, 3RNomics Group, Fraunhofer Institute for Cell Therapy and Immunology, 04103 Leipzig, Germany, 4Santa Fe Institute, Santa Fe, NM 87501 USA, 5Department of Theoretical Chemistry, University of Vienna, A-1090 Vienna, Austria and 6Max-Planck-Institute for Mathematics in Sciences, 04103 Leipzig, Germany"},{"name":"1 Interdisciplinary Center for Bioinformatics and Bioinformatics Group, Department of Computer Science, University Leipzig, 04107 Leipzig, Germany, 2Transcriptome Bioinformatics Group, LIFE \u2014 Leipzig Research Center for Civilization Diseases, University Leipzig, 04107 Leipzig, Germany, 3RNomics Group, Fraunhofer Institute for Cell Therapy and Immunology, 04103 Leipzig, Germany, 4Santa Fe Institute, Santa Fe, NM 87501 USA, 5Department of Theoretical Chemistry, University of Vienna, A-1090 Vienna, Austria and 6Max-Planck-Institute 
for Mathematics in Sciences, 04103 Leipzig, Germany"},{"name":"1 Interdisciplinary Center for Bioinformatics and Bioinformatics Group, Department of Computer Science, University Leipzig, 04107 Leipzig, Germany, 2Transcriptome Bioinformatics Group, LIFE \u2014 Leipzig Research Center for Civilization Diseases, University Leipzig, 04107 Leipzig, Germany, 3RNomics Group, Fraunhofer Institute for Cell Therapy and Immunology, 04103 Leipzig, Germany, 4Santa Fe Institute, Santa Fe, NM 87501 USA, 5Department of Theoretical Chemistry, University of Vienna, A-1090 Vienna, Austria and 6Max-Planck-Institute for Mathematics in Sciences, 04103 Leipzig, Germany"},{"name":"1 Interdisciplinary Center for Bioinformatics and Bioinformatics Group, Department of Computer Science, University Leipzig, 04107 Leipzig, Germany, 2Transcriptome Bioinformatics Group, LIFE \u2014 Leipzig Research Center for Civilization Diseases, University Leipzig, 04107 Leipzig, Germany, 3RNomics Group, Fraunhofer Institute for Cell Therapy and Immunology, 04103 Leipzig, Germany, 4Santa Fe Institute, Santa Fe, NM 87501 USA, 5Department of Theoretical Chemistry, University of Vienna, A-1090 Vienna, Austria and 6Max-Planck-Institute for Mathematics in Sciences, 04103 Leipzig, Germany"},{"name":"1 Interdisciplinary Center for Bioinformatics and Bioinformatics Group, Department of Computer Science, University Leipzig, 04107 Leipzig, Germany, 2Transcriptome Bioinformatics Group, LIFE \u2014 Leipzig Research Center for Civilization Diseases, University Leipzig, 04107 Leipzig, Germany, 3RNomics Group, Fraunhofer Institute for Cell Therapy and Immunology, 04103 Leipzig, Germany, 4Santa Fe Institute, Santa Fe, NM 87501 USA, 5Department of Theoretical Chemistry, University of Vienna, A-1090 Vienna, Austria and 6Max-Planck-Institute for Mathematics in Sciences, 04103 Leipzig, Germany"}]},{"given":"Steve","family":"Hoffmann","sequence":"additional","affiliation":[{"name":"1 Interdisciplinary Center for Bioinformatics and 
Bioinformatics Group, Department of Computer Science, University Leipzig, 04107 Leipzig, Germany, 2Transcriptome Bioinformatics Group, LIFE \u2014 Leipzig Research Center for Civilization Diseases, University Leipzig, 04107 Leipzig, Germany, 3RNomics Group, Fraunhofer Institute for Cell Therapy and Immunology, 04103 Leipzig, Germany, 4Santa Fe Institute, Santa Fe, NM 87501 USA, 5Department of Theoretical Chemistry, University of Vienna, A-1090 Vienna, Austria and 6Max-Planck-Institute for Mathematics in Sciences, 04103 Leipzig, Germany"},{"name":"1 Interdisciplinary Center for Bioinformatics and Bioinformatics Group, Department of Computer Science, University Leipzig, 04107 Leipzig, Germany, 2Transcriptome Bioinformatics Group, LIFE \u2014 Leipzig Research Center for Civilization Diseases, University Leipzig, 04107 Leipzig, Germany, 3RNomics Group, Fraunhofer Institute for Cell Therapy and Immunology, 04103 Leipzig, Germany, 4Santa Fe Institute, Santa Fe, NM 87501 USA, 5Department of Theoretical Chemistry, University of Vienna, A-1090 Vienna, Austria and 6Max-Planck-Institute for Mathematics in Sciences, 04103 Leipzig, Germany"}]}],"member":"286","published-online":{"date-parts":[[2012,5,10]]},"reference":[{"key":"2023012512380249700_B1","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1016\/S1570-8667(03)00065-0","article-title":"Replacing suffix trees with enhanced suffix arrays","volume":"2","author":"Abouelhoda","year":"2004","journal-title":"J. Discrete Algor."},{"key":"2023012512380249700_B2","doi-asserted-by":"crossref","first-page":"14616","DOI":"10.1073\/pnas.0704665104","article-title":"Patterns of damage in genomic DNA sequences from a Neandertal","volume":"104","author":"Briggs","year":"2007","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023012512380249700_B3","doi-asserted-by":"crossref","first-page":"295","DOI":"10.1038\/nrg2540","article-title":"Linking DNA methylation and histone modification: patterns and paradigms","volume":"10","author":"Cedar","year":"2009","journal-title":"Nat. Rev. Genet."},{"key":"2023012512380249700_B4","doi-asserted-by":"crossref","first-page":"203","DOI":"10.1186\/1471-2105-11-203","article-title":"BS Seeker: precise mapping for bisulfite sequencing","volume":"11","author":"Chen","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2023012512380249700_B5","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1038\/nature06745","article-title":"Shotgun bisulphite sequencing of the Arabidopsis genome reveals DNA methylation patterning","volume":"452","author":"Cokus","year":"2008","journal-title":"Nature"},{"key":"2023012512380249700_B6","doi-asserted-by":"crossref","first-page":"2157","DOI":"10.1126\/science.1080049","article-title":"The draft genome of Ciona intestinalis: insights into chordate and vertebrate origins","volume":"298","author":"Dehal","year":"2002","journal-title":"Science"},{"key":"2023012512380249700_B7","volume-title":"DNA Methylation: Approaches, Methods, and Applications.","author":"Esteller","year":"2005"},{"key":"2023012512380249700_B8","doi-asserted-by":"crossref","first-page":"286","DOI":"10.1038\/nrg2005","article-title":"Cancer epigenomics: DNA methylomes and histone-modification maps","volume":"8","author":"Esteller","year":"2007","journal-title":"Nat. Rev. Genet."},{"key":"2023012512380249700_B9","doi-asserted-by":"crossref","first-page":"1827","DOI":"10.1073\/pnas.89.5.1827","article-title":"A genomic sequencing protocol that yields a positive display of 5-methylcytosine residues in individual DNA strands","volume":"89","author":"Frommer","year":"1992","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023012512380249700_B10","doi-asserted-by":"crossref","first-page":"1447","DOI":"10.1126\/science.1171609","article-title":"Extensive demethylation of repetitive elements during seed development underlies gene imprinting","volume":"324","author":"Gehring","year":"2009","journal-title":"Science"},{"key":"2023012512380249700_B11","doi-asserted-by":"crossref","first-page":"768","DOI":"10.1038\/ng.865","article-title":"Increased methylation variation in epigenetic domains across cancer types","volume":"43","author":"Hansen","year":"2011","journal-title":"Nat. Genet."},{"key":"2023012512380249700_B12","doi-asserted-by":"crossref","first-page":"e1000502","DOI":"10.1371\/journal.pcbi.1000502","article-title":"Fast mapping of short sequences with mismatches, insertions and deletions using index structures","volume":"5","author":"Hoffmann","year":"2009","journal-title":"PLoS Comput. Biol."},{"key":"2023012512380249700_B13","doi-asserted-by":"crossref","first-page":"e8888","DOI":"10.1371\/journal.pone.0008888","article-title":"The behaviour of 5-hydroxymethylcytosine in bisulfite sequencing","volume":"5","author":"Huang","year":"2010","journal-title":"PLoS One"},{"key":"2023012512380249700_B14","doi-asserted-by":"crossref","first-page":"200","DOI":"10.1007\/3-540-44888-8_15","article-title":"Space efficient linear time construction of suffix arrays","volume-title":"Combinatorial Pattern Matching (CPM 03)","author":"Ko","year":"2003"},{"key":"2023012512380249700_B15","doi-asserted-by":"crossref","first-page":"1571","DOI":"10.1093\/bioinformatics\/btr167","article-title":"Bismark: a flexible aligner and methylation caller for bisulfite-seq applications","volume":"27","author":"Krueger","year":"2011","journal-title":"Bioinformatics"},{"key":"2023012512380249700_B16","doi-asserted-by":"crossref","first-page":"R25","DOI":"10.1186\/gb-2009-10-3-r25","article-title":"Ultrafast and memory-efficient alignment of short DNA sequences to the human 
genome","volume":"10","author":"Langmead","year":"2009","journal-title":"Genome Biol."},{"key":"2023012512380249700_B17","doi-asserted-by":"crossref","first-page":"959","DOI":"10.1101\/gr.083451.108","article-title":"Finding the fifth base: genome-wide sequencing of cytosine methylation","volume":"19","author":"Lister","year":"2009","journal-title":"Genome Res."},{"key":"2023012512380249700_B18","doi-asserted-by":"crossref","first-page":"315","DOI":"10.1038\/nature08514","article-title":"Human DNA methylomes at base resolution show widespread epigenomic differences","volume":"462","author":"Lister","year":"2009","journal-title":"Nature"},{"key":"2023012512380249700_B19","doi-asserted-by":"crossref","first-page":"68","DOI":"10.1038\/nature09798","article-title":"Hotspots of aberrant epigenomic reprogramming in human induced pluripotent stem cells","volume":"471","author":"Lister","year":"2011","journal-title":"Nature"},{"key":"2023012512380249700_B20","doi-asserted-by":"crossref","first-page":"1851","DOI":"10.1101\/gr.078212.108","article-title":"Mapping short DNA sequencing reads and calling variants using mapping quality scores","volume":"18","author":"Li","year":"2008","journal-title":"Genome Res."},{"key":"2023012512380249700_B21","doi-asserted-by":"crossref","first-page":"2078","DOI":"10.1093\/bioinformatics\/btp352","article-title":"The Sequence Alignment\/Map format and SAM tools","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012512380249700_B22","doi-asserted-by":"crossref","first-page":"e1000533","DOI":"10.1371\/journal.pbio.1000533","article-title":"The DNA methylome of human peripheral blood mononuclear cells","volume":"8","author":"Li","year":"2010","journal-title":"PLoS Biol."},{"key":"2023012512380249700_B23","doi-asserted-by":"crossref","first-page":"e1000506","DOI":"10.1371\/journal.pbio.1000506","article-title":"The honey bee epigenomes: differential methylation of brain DNA in queens and 
workers","volume":"8","author":"Lyko","year":"2010","journal-title":"PLoS Biol."},{"key":"2023012512380249700_B24","doi-asserted-by":"crossref","first-page":"5868","DOI":"10.1093\/nar\/gki901","article-title":"Reduced representation bisulfite sequencing for comparative high-resolution DNA methylation analysis","volume":"33","author":"Meissner","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2023012512380249700_B25","doi-asserted-by":"crossref","first-page":"830","DOI":"10.1101\/gr.115907.110","article-title":"Natural genetic variation caused by small insertions and deletions in the human genome","volume":"21","author":"Mills","year":"2011","journal-title":"Genome Res."},{"key":"2023012512380249700_B26","doi-asserted-by":"crossref","first-page":"395","DOI":"10.1145\/316542.316550","article-title":"A fast bit-vector algorithm for approximate string matching based on dynamic programming","volume":"46","author":"Myers","year":"1999","journal-title":"J. ACM"},{"key":"2023012512380249700_B27","doi-asserted-by":"crossref","first-page":"R47","DOI":"10.1186\/gb-2010-11-5-r47","article-title":"Computational challenges in the analysis of ancient DNA","volume":"11","author":"Pr\u00fcfer","year":"2010","journal-title":"Genome Biol."},{"key":"2023012512380249700_B28","doi-asserted-by":"crossref","first-page":"1064","DOI":"10.1038\/nature06967","article-title":"The amphioxus genome and the evolution of the chordate karyotype","volume":"453","author":"Putnam","year":"2008","journal-title":"Nature"},{"key":"2023012512380249700_B29","doi-asserted-by":"crossref","first-page":"425","DOI":"10.1038\/nature05918","article-title":"Stability and flexibility of epigenetic gene regulation in mammalian development","volume":"447","author":"Reik","year":"2007","journal-title":"Nature"},{"key":"2023012512380249700_B30","doi-asserted-by":"crossref","first-page":"391","DOI":"10.1093\/nar\/gkp992","article-title":"MBD-isolated genome sequencing provides a high-throughput and 
comprehensive survey of DNA methylation in the human genome","volume":"38","author":"Serre","year":"2010","journal-title":"Nucleic Acids Res."},{"key":"2023012512380249700_B31","doi-asserted-by":"crossref","first-page":"2841","DOI":"10.1093\/bioinformatics\/btp533","article-title":"Updates to the RMAP short-read mapping software","volume":"25","author":"Smith","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012512380249700_B32","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1016\/S0065-2423(10)52006-7","article-title":"Methylation of DNA in cancer","volume":"52","author":"Watanabe","year":"2010","journal-title":"Adv. Clin. Chem."},{"key":"2023012512380249700_B33","doi-asserted-by":"crossref","first-page":"853","DOI":"10.1038\/ng1598","article-title":"Chromosome-wide and promoter-specific analyses identify sites of differential DNA methylation in normal and transformed human cells","volume":"37","author":"Weber","year":"2005","journal-title":"Nat. Genet."},{"key":"2023012512380249700_B34","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1016\/j.ceb.2007.04.011","article-title":"Genomic patterns of DNA methylation: targets and function of an epigenetic mark","volume":"19","author":"Weber","year":"2007","journal-title":"Curr. Opin. Cell Biol."},{"key":"2023012512380249700_B35","doi-asserted-by":"crossref","first-page":"516","DOI":"10.1038\/nbt.1626","article-title":"Single base-resolution methylome of the silkworm reveals a sparse epigenomic map","volume":"28","author":"Xiang","year":"2010","journal-title":"Nat. 
Biotechnol."},{"key":"2023012512380249700_B36","doi-asserted-by":"crossref","first-page":"232","DOI":"10.1186\/1471-2105-10-232","article-title":"BSMAP: whole genome bisulfite sequence mapping program","volume":"10","author":"Xi","year":"2009","journal-title":"BMC Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/13\/1698\/48866872\/bioinformatics_28_13_1698.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/13\/1698\/48866872\/bioinformatics_28_13_1698.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T16:39:31Z","timestamp":1674664771000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/28\/13\/1698\/235152"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,5,10]]},"references-count":36,"journal-issue":{"issue":"13","published-print":{"date-parts":[[2012,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bts254","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2012,7,1]]},"published":{"date-parts":[[2012,5,10]]}}}