{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,27]],"date-time":"2026-02-27T13:39:32Z","timestamp":1772199572221,"version":"3.50.1"},"reference-count":58,"publisher":"Oxford University Press (OUP)","issue":"2","license":[{"start":{"date-parts":[[2018,7,10]],"date-time":"2018-07-10T00:00:00Z","timestamp":1531180800000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"name":"VW Foundation","award":["VWZN3157"],"award-info":[{"award-number":["VWZN3157"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,1,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Most methods for pairwise and multiple genome alignment use fast local homology search tools to identify anchor points, i.e. high-scoring local alignments of the input sequences. Sequence segments between those anchor points are then aligned with slower, more sensitive methods. Finding suitable anchor points is therefore crucial for genome sequence comparison; speed and sensitivity of genome alignment depend on the underlying anchoring methods.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>In this article, we use filtered spaced word matches to generate anchor points for genome alignment. For a given binary pattern representing match and don\u2019t-care positions, we first search for spaced-word matches, i.e. ungapped local pairwise alignments with matching nucleotides at the match positions of the pattern and possible mismatches at the don\u2019t-care positions. Those spaced-word matches that have similarity scores above some threshold value are then extended using a standard X-drop algorithm; the resulting local alignments are used as anchor points. To evaluate this approach, we used the popular multiple-genome-alignment pipeline Mugsy and replaced the exact word matches that Mugsy uses as anchor points with our spaced-word-based anchor points. For closely related genome sequences, the two anchoring procedures lead to multiple alignments of similar quality. For distantly related genomes, however, alignments calculated with our filtered-spaced-word matches are superior to alignments produced with the original Mugsy program where exact word matches are used to find anchor points.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>http:\/\/spacedanchor.gobics.de<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/bty592","type":"journal-article","created":{"date-parts":[[2018,7,9]],"date-time":"2018-07-09T15:32:39Z","timestamp":1531150359000},"page":"211-218","source":"Crossref","is-referenced-by-count":16,"title":["Accurate multiple alignment of distantly related genome sequences using filtered spaced word matches as anchor points"],"prefix":"10.1093","volume":"35","author":[{"given":"Chris-Andr\u00e9","family":"Leimeister","sequence":"first","affiliation":[{"name":"Department of Bioinformatics, Institute of Microbiology and Genetics"}]},{"given":"Thomas","family":"Dencker","sequence":"additional","affiliation":[{"name":"Department of Bioinformatics, Institute of Microbiology and Genetics"}]},{"given":"Burkhard","family":"Morgenstern","sequence":"additional","affiliation":[{"name":"Department of Bioinformatics, Institute of Microbiology and Genetics"},{"name":"Center for Computational Sciences, University of Goettingen, Goettingen, Germany"}]}],"member":"286","published-online":{"date-parts":[[2018,7,10]]},"reference":[{"key":"2023013107224964000_bty592-B1","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","article-title":"Basic local alignment search tool","volume":"215","author":"Altschul","year":"1990","journal-title":"J. Mol. Biol"},{"key":"2023013107224964000_bty592-B2","doi-asserted-by":"crossref","first-page":"334","DOI":"10.1093\/bioinformatics\/btq665","article-title":"Mugsy: fast multiple alignment of closely related whole genomes","volume":"27","author":"Angiuoli","year":"2011","journal-title":"Bioinformatics"},{"key":"2023013107224964000_bty592-B3","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1093\/bib\/6.1.6","article-title":"The many faces of sequence alignment","volume":"6","author":"Batzoglou","year":"2005","journal-title":"Brief. Bioinformatics"},{"key":"2023013107224964000_bty592-B4","doi-asserted-by":"crossref","first-page":"708","DOI":"10.1101\/gr.1933104","article-title":"Aligning multiple genomic sequences with the threaded blockset aligner","volume":"14","author":"Blanchette","year":"2004","journal-title":"Genome Res"},{"key":"2023013107224964000_bty592-B5","doi-asserted-by":"crossref","first-page":"e1000392.","DOI":"10.1371\/journal.pcbi.1000392","article-title":"Fast statistical alignment","volume":"5","author":"Bradley","year":"2009","journal-title":"PLoS Comput. Biol"},{"key":"2023013107224964000_bty592-B6","doi-asserted-by":"crossref","first-page":"3525","DOI":"10.1093\/nar\/gkg623","article-title":"MAVID multiple alignment server","volume":"31","author":"Bray","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2023013107224964000_bty592-B7","doi-asserted-by":"crossref","first-page":"97","DOI":"10.1101\/gr.789803","article-title":"AVID: a Global Alignment Program","volume":"13","author":"Bray","year":"2003","journal-title":"Genome Res"},{"key":"2023013107224964000_bty592-B8","doi-asserted-by":"crossref","first-page":"364","DOI":"10.1016\/j.jcss.2004.12.008","article-title":"Vector seeds: an extension to spaced seeds","volume":"70","author":"Brejova","year":"2005","journal-title":"J. Comp. Syst. Sci"},{"key":"2023013107224964000_bty592-B9","doi-asserted-by":"crossref","first-page":"3584","DOI":"10.1093\/bioinformatics\/btv419","article-title":"Spaced seeds improve k-mer-based metagenomic classification","volume":"31","author":"B\u0159inda","year":"2015","journal-title":"Bioinformatics"},{"key":"2023013107224964000_bty592-B10","doi-asserted-by":"crossref","first-page":"66.","DOI":"10.1186\/1471-2105-4-66","article-title":"Fast and sensitive multiple alignment of large genomic sequences","volume":"4","author":"Brudno","year":"2003","journal-title":"BMC Bioinformatics"},{"key":"2023013107224964000_bty592-B11","doi-asserted-by":"crossref","first-page":"721","DOI":"10.1101\/gr.926603","article-title":"LAGAN and multi-LAGAN: efficient tools for large-scale multiple alignment of genomic DNA","volume":"13","author":"Brudno","year":"2003","journal-title":"Genome Res"},{"key":"2023013107224964000_bty592-B12","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1038\/nmeth.3176","article-title":"Fast and sensitive protein alignment using DIAMOND","volume":"12","author":"Buchfink","year":"2015","journal-title":"Nat. Methods"},{"key":"2023013107224964000_bty592-B13","first-page":"115","volume-title":"Pacific Symposium on Biocomputing","author":"Chiaromonte","year":"2002"},{"key":"2023013107224964000_bty592-B14","doi-asserted-by":"crossref","first-page":"1053.","DOI":"10.1093\/bioinformatics\/bth037","article-title":"Good spaced seeds for homology search","volume":"20","author":"Choi","year":"2004","journal-title":"Bioinformatics"},{"key":"2023013107224964000_bty592-B15","doi-asserted-by":"crossref","first-page":"1115","DOI":"10.1093\/molbev\/msr268","article-title":"ALF: a simulation framework for genome evolution","volume":"29","author":"Dalquen","year":"2012","journal-title":"Mol. Biol. Evol"},{"key":"2023013107224964000_bty592-B16","doi-asserted-by":"crossref","first-page":"1394","DOI":"10.1101\/gr.2289704","article-title":"Mauve: multiple alignment of conserved genomic sequence with rearrangements","volume":"14","author":"Darling","year":"2004","journal-title":"Genome Res"},{"key":"2023013107224964000_bty592-B17","doi-asserted-by":"crossref","first-page":"126","DOI":"10.1007\/11851561_12","volume-title":"Algorithms in Bioinformatics, Lecture Notes in Bioinformatics","author":"Darling","year":"2006"},{"key":"2023013107224964000_bty592-B18","doi-asserted-by":"crossref","first-page":"e11147.","DOI":"10.1371\/journal.pone.0011147","article-title":"progressiveMauve: multiple genome alignment with gene gain, loss and rearrangement","volume":"5","author":"Darling","year":"2010","journal-title":"PLoS One"},{"key":"2023013107224964000_bty592-B19","doi-asserted-by":"crossref","first-page":"1011","DOI":"10.1093\/bioinformatics\/btr046","article-title":"SHRiMP2: sensitive yet practical short read mapping","volume":"27","author":"David","year":"2011","journal-title":"Bioinformatics"},{"key":"2023013107224964000_bty592-B20","doi-asserted-by":"crossref","first-page":"2369","DOI":"10.1093\/nar\/27.11.2369","article-title":"Alignment of whole genomes","volume":"27","author":"Delcher","year":"1999","journal-title":"Nucleic Acids Res"},{"key":"2023013107224964000_bty592-B21","doi-asserted-by":"crossref","first-page":"39194.","DOI":"10.1038\/srep39194","article-title":"PaPrBaG: a machine learning approach for the detection of novel pathogens from NGS data","volume":"7","author":"Deneke","year":"2017","journal-title":"Sci. Rep"},{"key":"2023013107224964000_bty592-B22","doi-asserted-by":"crossref","first-page":"R51","DOI":"10.1093\/hmg\/ddl056","article-title":"Evolution at the nucleotide level: the problem of multiple whole-genome alignment","volume":"15","author":"Dewey","year":"2006","journal-title":"Hum. Mol. Genet"},{"key":"2023013107224964000_bty592-B23","doi-asserted-by":"crossref","first-page":"11.","DOI":"10.1186\/1471-2105-9-11","article-title":"SeqAn\u2014an efficient, generic C++ library for sequence analysis","volume":"9","author":"D\u00f6ring","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023013107224964000_bty592-B24","doi-asserted-by":"crossref","first-page":"682","DOI":"10.1101\/gr.081778.108","article-title":"Multiple whole-genome alignments without a reference organism","volume":"19","author":"Dubchak","year":"2009","journal-title":"Genome Res"},{"key":"2023013107224964000_bty592-B25","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511790492","volume-title":"Biological Sequence Analysis","author":"Durbin","year":"1998"},{"key":"2023013107224964000_bty592-B26","doi-asserted-by":"crossref","first-page":"2077","DOI":"10.1101\/gr.174920.114","article-title":"Alignathon: a competitive assessment of whole-genome alignment methods","volume":"24","author":"Earl","year":"2014","journal-title":"Genome Res"},{"key":"2023013107224964000_bty592-B27","doi-asserted-by":"crossref","first-page":"705","DOI":"10.1016\/0022-2836(82)90398-9","article-title":"An improved algorithm for matching biological sequences","volume":"162","author":"Gotoh","year":"1982","journal-title":"J. Mol. Biol"},{"key":"2023013107224964000_bty592-B28","doi-asserted-by":"crossref","first-page":"e1005107","DOI":"10.1371\/journal.pcbi.1005107","article-title":"rasbhari: optimizing spaced seeds for database searching, read mapping and alignment-free sequence comparison","volume":"12","author":"Hahn","year":"2016","journal-title":"PLoS Comput. Biol"},{"key":"2023013107224964000_bty592-B29","doi-asserted-by":"crossref","first-page":"1169","DOI":"10.1093\/bioinformatics\/btu815","article-title":"andi: fast and accurate estimation of evolutionary distances between closely related genomes","volume":"31","author":"Haubold","year":"2015","journal-title":"Bioinformatics"},{"key":"2023013107224964000_bty592-B30","doi-asserted-by":"crossref","first-page":"i349","DOI":"10.1093\/bioinformatics\/btu439","article-title":"Lambda: the local aligner for massive biological data","volume":"30","author":"Hauswedell","year":"2014","journal-title":"Bioinformatics"},{"key":"2023013107224964000_bty592-B31","doi-asserted-by":"crossref","first-page":"S312","DOI":"10.1093\/bioinformatics\/18.suppl_1.S312","article-title":"Efficient multiple genome alignment","volume":"18","author":"H\u00f6hl","year":"2002","journal-title":"Bioinformatics"},{"key":"2023013107224964000_bty592-B32","doi-asserted-by":"crossref","first-page":"W7","DOI":"10.1093\/nar\/gku398","article-title":"Spaced words and kmacs: fast alignment-free sequence comparison based on inexact word matches","volume":"42","author":"Horwege","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2023013107224964000_bty592-B33","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1093\/bioinformatics\/bti772","article-title":"Accurate anchoring alignment of divergent sequences","volume":"22","author":"Huang","year":"2006","journal-title":"Bioinformatics"},{"key":"2023013107224964000_bty592-B34","doi-asserted-by":"crossref","first-page":"1149","DOI":"10.1002\/(SICI)1097-024X(199911)29:13<1149::AID-SPE274>3.0.CO;2-O","article-title":"Reducing the space requirement of suffix trees","volume":"29","author":"Kurtz","year":"1999","journal-title":"Softw. Pract. Exp"},{"key":"2023013107224964000_bty592-B35","doi-asserted-by":"crossref","first-page":"R12.","DOI":"10.1186\/gb-2004-5-2-r12","article-title":"Versatile and open software for comparing large genomes","volume":"5","author":"Kurtz","year":"2004","journal-title":"Genome Biol"},{"key":"2023013107224964000_bty592-B36","doi-asserted-by":"crossref","first-page":"R25.","DOI":"10.1186\/gb-2009-10-3-r25","article-title":"Ultrafast and memory-efficient alignment of short DNA sequences to the human genome","volume":"10","author":"Langmead","year":"2009","journal-title":"Genome Biol"},{"key":"2023013107224964000_bty592-B37","doi-asserted-by":"crossref","first-page":"1991","DOI":"10.1093\/bioinformatics\/btu177","article-title":"Fast Alignment-Free sequence comparison using spaced-word frequencies","volume":"30","author":"Leimeister","year":"2014","journal-title":"Bioinformatics"},{"key":"2023013107224964000_bty592-B38","doi-asserted-by":"crossref","first-page":"971","DOI":"10.1093\/bioinformatics\/btw776","article-title":"Fast and accurate phylogeny reconstruction using filtered spaced-word matches","volume":"33","author":"Leimeister","year":"2017","journal-title":"Bioinformatics"},{"key":"2023013107224964000_bty592-B39","first-page":"164","article-title":"PatternHunter II: highly sensitive and fast homology search","volume":"14","author":"Li","year":"2003","journal-title":"Genome Informatics"},{"key":"2023013107224964000_bty592-B40","doi-asserted-by":"crossref","first-page":"444","DOI":"10.1145\/1109557.1109607","volume-title":"Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithm, SODA \u201906","author":"Li","year":"2006"},{"key":"2023013107224964000_bty592-B41","doi-asserted-by":"crossref","first-page":"440","DOI":"10.1093\/bioinformatics\/18.3.440","article-title":"PatternHunter: faster and more sensitive homology search","volume":"18","author":"Ma","year":"2002","journal-title":"Bioinformatics"},{"key":"2023013107224964000_bty592-B42","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1016\/S0893-9659(01)00085-4","article-title":"A simple and space-efficient fragment-chaining algorithm for alignment of DNA and protein sequences","volume":"15","author":"Morgenstern","year":"2002","journal-title":"Appl. Math. Lett"},{"key":"2023013107224964000_bty592-B43","doi-asserted-by":"crossref","first-page":"12098","DOI":"10.1073\/pnas.93.22.12098","article-title":"Multiple DNA and protein sequence alignment based on segment-to-segment comparison","volume":"93","author":"Morgenstern","year":"1996","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023013107224964000_bty592-B44","doi-asserted-by":"crossref","first-page":"777","DOI":"10.1093\/bioinformatics\/18.6.777","article-title":"Exon discovery by genomic sequence alignment","volume":"18","author":"Morgenstern","year":"2002","journal-title":"Bioinformatics"},{"key":"2023013107224964000_bty592-B45","doi-asserted-by":"crossref","first-page":"6.","DOI":"10.1186\/1748-7188-1-6","article-title":"Multiple sequence alignment with user-defined anchor points","volume":"1","author":"Morgenstern","year":"2006","journal-title":"Algorith. Mol. Biol"},{"key":"2023013107224964000_bty592-B46","doi-asserted-by":"crossref","first-page":"5.","DOI":"10.1186\/s13015-015-0032-x","article-title":"Estimating evolutionary distances between genomic sequences from spaced-word matches","volume":"10","author":"Morgenstern","year":"2015","journal-title":"Algorith. Mol. Biol"},{"key":"2023013107224964000_bty592-B47","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1016\/0022-2836(70)90057-4","article-title":"A general method applicable to the search for similarities in the amino acid sequence of two proteins","volume":"48","author":"Needleman","year":"1970","journal-title":"J. Mol. Biol"},{"key":"2023013107224964000_bty592-B48","article-title":"Best hits of 11110110111: model-free selection and parameter-free sensitivity calculation of spaced seeds","volume":"12","author":"No\u00e9","year":"2017","journal-title":"Algorith. Mole. Biol"},{"key":"2023013107224964000_bty592-B49","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1155\/2010\/708501","article-title":"Designing efficient spaced seeds for SOLiD read mapping","volume":"2010","author":"No\u00e9","year":"2010","journal-title":"Adv. Bioinformatics"},{"key":"2023013107224964000_bty592-B50","doi-asserted-by":"crossref","first-page":"1703","DOI":"10.1093\/bioinformatics\/18.12.1703","article-title":"OWEN: aligning long collinear regions of genomes","volume":"18","author":"Ogurtsov","year":"2002","journal-title":"Bioinformatics"},{"key":"2023013107224964000_bty592-B51","doi-asserted-by":"crossref","first-page":"286","DOI":"10.1007\/978-3-662-48221-6_21","volume-title":"Algorithms in Bioinformatics: 15th International Workshop, WABI 2015, Atlanta, GA, USA, September 10\u201312, 2015, Proceedings","author":"Ounit","year":"2015"},{"key":"2023013107224964000_bty592-B52","doi-asserted-by":"crossref","first-page":"1512","DOI":"10.1101\/gr.123356.111","article-title":"Cactus: algorithms for genome multiple sequence alignment","volume":"21","author":"Paten","year":"2011","journal-title":"Genome Res"},{"key":"2023013107224964000_bty592-B53","doi-asserted-by":"crossref","first-page":"2336","DOI":"10.1101\/gr.2657504","article-title":"A novel method for multiple alignment of sequences with repeated and shuffled elements","volume":"14","author":"Raphael","year":"2004","journal-title":"Genome Res"},{"key":"2023013107224964000_bty592-B54","doi-asserted-by":"crossref","first-page":"i187","DOI":"10.1093\/bioinformatics\/btn281","article-title":"Segment-based multiple sequence alignment","volume":"24","author":"Rausch","year":"2008","journal-title":"Bioinformatics"},{"key":"2023013107224964000_bty592-B55","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1016\/0022-2836(81)90087-5","article-title":"Identification of common molecular subsequences","volume":"147","author":"Smith","year":"1981","journal-title":"J. Mol. Biol"},{"key":"2023013107224964000_bty592-B56","doi-asserted-by":"crossref","first-page":"796","DOI":"10.1038\/35048692","article-title":"Analysis of the genome sequence of the flowering plant Arabidopsis thaliana","volume":"408","author":"The Arabidopsis Genome Initiative","year":"2000","journal-title":"Nature"},{"key":"2023013107224964000_bty592-B57","doi-asserted-by":"crossref","first-page":"1355","DOI":"10.1089\/cmb.2006.13.1355","article-title":"Optimizing multiple spaced seeds for homology search","volume":"13","author":"Xu","year":"2006","journal-title":"J. Comput. Biol"},{"key":"2023013107224964000_bty592-B58","doi-asserted-by":"crossref","first-page":"e75.","DOI":"10.1093\/nar\/gkt003","article-title":"Co-phylog: an assembly-free phylogenomic approach for closely related organisms","volume":"41","author":"Yi","year":"2013","journal-title":"Nucleic Acids Res"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/2\/211\/48963275\/bioinformatics_35_2_211.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/2\/211\/48963275\/bioinformatics_35_2_211.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,3]],"date-time":"2023-09-03T12:30:19Z","timestamp":1693744219000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/35\/2\/211\/5051199"}},"subtitle":[],"editor":[{"given":"John","family":"Hancock","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2018,7,10]]},"references-count":58,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2019,1,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bty592","relation":{"has-review":[{"id-type":"doi","id":"10.3410\/f.733626135.793555265","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2019,1,15]]},"published":{"date-parts":[[2018,7,10]]}}}