{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,28]],"date-time":"2025-09-28T12:50:38Z","timestamp":1759063838763},"reference-count":12,"publisher":"Hindawi Limited","license":[{"start":{"date-parts":[[2012,1,1]],"date-time":"2012-01-01T00:00:00Z","timestamp":1325376000000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/3.0\/"}],"funder":[{"DOI":"10.13039\/501100008530","name":"Funda\u00e7\u00e3o para a Ci\u00eancia e Tecnologia","doi-asserted-by":"crossref","award":["SFRH\/BPD\/43646\/2008","PEST-C\/MAT\/UI0144\/2011"],"award-info":[{"award-number":["SFRH\/BPD\/43646\/2008","PEST-C\/MAT\/UI0144\/2011"]}],"id":[{"id":"10.13039\/501100008530","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["The Scientific World Journal"],"published-print":{"date-parts":[[2012]]},"abstract":"<jats:p>The vast majority of methods available for sequence comparison rely on a first sequence alignment step, which requires a number of assumptions on evolutionary history and is sometimes very difficult or impossible to perform due to the abundance of gaps (insertions\/deletions). In such cases, an alternative alignment-free method would prove valuable. Our method starts by a computation of a generalized suffix tree of all sequences, which is completed in linear time. Using this tree, the frequency of all possible words with a preset length <jats:italic>L<\/jats:italic>\u2014<jats:italic>L-words<\/jats:italic>\u2014in each sequence is rapidly calculated. Based on the <jats:italic>L-words<\/jats:italic> frequency profile of each sequence, a pairwise standard Euclidean distance is then computed producing a symmetric genetic distance matrix, which can be used to generate a neighbor joining dendrogram or a multidimensional scaling graph. 
We present an improvement to word counting alignment-free approaches for sequence comparison, by determining a single optimal word length and combining suffix tree structures to the word counting tasks. Our approach is, thus, a fast and simple application that proved to be efficient and powerful when applied to mitochondrial genomes. The algorithm was implemented in Python language and is freely available on the web.<\/jats:p>","DOI":"10.1100\/2012\/450124","type":"journal-article","created":{"date-parts":[[2012,9,10]],"date-time":"2012-09-10T22:00:22Z","timestamp":1347314422000},"page":"1-4","source":"Crossref","is-referenced-by-count":11,"title":["Sequence Comparison Alignment-Free Approach Based on Suffix Tree and <i>L-Words<\/i> Frequency"],"prefix":"10.1100","volume":"2012","author":[{"given":"In\u00eas","family":"Soares","sequence":"first","affiliation":[{"name":"Faculdade de Ci\u00eancias da Universidade do Porto, 4169 Porto, Portugal"},{"name":"Instituto de Patologia e Imunologia Molecular da Universidade do Porto, 4200 Porto, Portugal"},{"name":"Centro de Matem\u00e1tica da Universidade do Porto, 4169 Porto, Portugal"}]},{"given":"Ana","family":"Goios","sequence":"additional","affiliation":[{"name":"Instituto de Patologia e Imunologia Molecular da Universidade do Porto, 4200 Porto, Portugal"}]},{"given":"Ant\u00f3nio","family":"Amorim","sequence":"additional","affiliation":[{"name":"Faculdade de Ci\u00eancias da Universidade do Porto, 4169 Porto, Portugal"},{"name":"Instituto de Patologia e Imunologia Molecular da Universidade do Porto, 4200 Porto, 
Portugal"}]}],"member":"98","reference":[{"key":"1","doi-asserted-by":"publisher","DOI":"10.1006\/mpev.2000.0792"},{"key":"2","doi-asserted-by":"publisher","DOI":"10.1186\/gb-2008-9-10-r151"},{"key":"3","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0007767"},{"key":"4","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btq665"},{"key":"5","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btg005"},{"key":"6","doi-asserted-by":"publisher","DOI":"10.1093\/nar\/gkh340"},{"key":"7","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.0813249106"},{"key":"9","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btr131"},{"key":"8","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btp590"},{"key":"10","year":"1997"},{"key":"11","doi-asserted-by":"publisher","DOI":"10.1504\/IJBRA.2008.017165"},{"key":"12","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0041175"}],"container-title":["The Scientific World Journal"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/downloads.hindawi.com\/journals\/tswj\/2012\/450124.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/downloads.hindawi.com\/journals\/tswj\/2012\/450124.xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/downloads.hindawi.com\/journals\/tswj\/2012\/450124.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2020,5,7]],"date-time":"2020-05-07T05:15:49Z","timestamp":1588828549000},"score":1,"resource":{"primary":{"URL":"http:\/\/www.hindawi.com\/journals\/tswj\/2012\/450124\/"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012]]},"references-count":12,"alternative-id":["450124","450124"],"URL":"https:\/\/doi.org\/10.1100\/2012\/450124","relation":{},"ISSN":["1537-744X"],"issn-type":[{"value":"1537-744X","type":"electron
ic"}],"subject":[],"published":{"date-parts":[[2012]]}}}