{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,26]],"date-time":"2026-03-26T03:07:47Z","timestamp":1774494467283,"version":"3.50.1"},"reference-count":59,"publisher":"Oxford University Press (OUP)","issue":"6","license":[{"start":{"date-parts":[[2018,8,28]],"date-time":"2018-08-28T00:00:00Z","timestamp":1535414400000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"name":"RepeatALS FGBR"},{"name":"Italian Society for Research on Amyotrophic Lateral Sclerosis","award":["PRIN 201534HNXC"],"award-info":[{"award-number":["PRIN 201534HNXC"]}]},{"name":"Italian Ministry of Education and University"},{"DOI":"10.13039\/501100003407","name":"MIUR","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100003407","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,3,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Large-scale sequencing projects have confirmed the hypothesis that eukaryotic DNA is rich in repetitions whose functional role needs to be elucidated. In particular, tandem repeats (TRs) (i.e. short, almost identical sequences that lie adjacent to each other) have been associated to many cellular processes and, indeed, are also involved in several genetic disorders. The need of comprehensive lists of TRs for association studies and the absence of a computational model able to capture their variability have revived research on discovery algorithms.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>Building upon the idea that sequence similarities can be easily displayed using graphical methods, we formalized the structure that TRs induce in dot-plot matrices where a sequence is compared with itself. Leveraging on the observation that a compact representation of these matrices can be built and searched in linear time, we developed Dot2dot: an accurate algorithm fast enough to be suitable for whole-genome discovery of TRs. Experiments on five manually curated collections of TRs have shown that Dot2dot is more accurate than other established methods, and completes the analysis of the biggest known reference genome in about one day on a standard PC.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>Source code and datasets are freely available upon paper acceptance at the URL: https:\/\/github.com\/Gege7177\/Dot2dot.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/bty747","type":"journal-article","created":{"date-parts":[[2018,8,24]],"date-time":"2018-08-24T15:13:30Z","timestamp":1535123610000},"page":"914-922","source":"Crossref","is-referenced-by-count":21,"title":["<i>Dot2dot<\/i>\n                    : accurate whole-genome tandem repeats discovery"],"prefix":"10.1093","volume":"35","author":[{"given":"Loredana M","family":"Genovese","sequence":"first","affiliation":[{"name":"Institute for Informatics and Telematics, CNR, Pisa, Italy"}]},{"given":"Marco M","family":"Mosca","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Liverpool, Liverpool, UK"}]},{"given":"Marco","family":"Pellegrini","sequence":"additional","affiliation":[{"name":"Institute for Informatics and Telematics, CNR, Pisa, Italy"},{"name":"Laboratory of Integrative Systems Medicine (LISM), Institute of Informatics and Telematics and Institute of Clinical Physiology, Pisa, Italy"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6993-6761","authenticated-orcid":false,"given":"Filippo","family":"Geraci","sequence":"additional","affiliation":[{"name":"Institute for Informatics and Telematics, CNR, Pisa, Italy"}]}],"member":"286","published-online":{"date-parts":[[2018,8,28]]},"reference":[{"key":"2023013107261356200_bty747-B1","author":"Abajian","year":"1994"},{"key":"2023013107261356200_bty747-B2","doi-asserted-by":"crossref","first-page":"736","DOI":"10.1093\/humrep\/deh666","article-title":"Is the cag repeat of mitochondrial dna polymerase gamma (polg) associated with male infertility? A multi-centre french study","volume":"20","author":"Aknin-Seifer","year":"2005","journal-title":"Hum. Reprod"},{"key":"2023013107261356200_bty747-B3","doi-asserted-by":"crossref","first-page":"e29548.","DOI":"10.1371\/journal.pone.0029548","article-title":"Cag repeat variants in the polg1 gene encoding mtdna polymerase-gamma and risk of breast cancer in African-American women","volume":"7","author":"Azrak","year":"2012","journal-title":"PLoS One"},{"key":"2023013107261356200_bty747-B4","doi-asserted-by":"crossref","first-page":"1545","DOI":"10.1101\/gr.078303.108","article-title":"Abundance and length of simple repeats in vertebrate genomes are determined by their structural properties","volume":"18","author":"Bacolla","year":"2008","journal-title":"Genome Res"},{"key":"2023013107261356200_bty747-B5","doi-asserted-by":"crossref","first-page":"573.","DOI":"10.1093\/nar\/27.2.573","article-title":"Tandem repeats finder: a program to analyze DNA sequences","volume":"27","author":"Benson","year":"1999","journal-title":"Nucleic Acids Res"},{"key":"2023013107261356200_bty747-B6","doi-asserted-by":"crossref","first-page":"676","DOI":"10.1093\/bioinformatics\/btk032","article-title":"Short fuzzy tandem repeats in genomic sequences, identification, and possible role in regulation of gene expression","volume":"22","author":"Boeva","year":"2006","journal-title":"Bioinformatics"},{"key":"2023013107261356200_bty747-B7","doi-asserted-by":"crossref","first-page":"795.","DOI":"10.1186\/1471-2164-14-795","article-title":"Starrrt: a table of short tandem repeats in regulatory regions of the human genome","volume":"14","author":"Bolton","year":"2013","journal-title":"BMC Genomics"},{"key":"2023013107261356200_bty747-B8","doi-asserted-by":"crossref","first-page":"278","DOI":"10.1016\/S0531-5131(03)01713-8","article-title":"Forensic value of the multicopy y-str marker dys464","volume":"1261","author":"Butler","year":"2004","journal-title":"Int. Congr. Ser"},{"key":"2023013107261356200_bty747-B9","doi-asserted-by":"crossref","first-page":"165","DOI":"10.1038\/nrm2854","article-title":"Repeat instability as the basis for human diseases and as a potential target for therapy","volume":"11","author":"Castel","year":"2010","journal-title":"Nat. Rev. Mol. Cell Biol"},{"key":"2023013107261356200_bty747-B10","doi-asserted-by":"crossref","first-page":"634","DOI":"10.1093\/bioinformatics\/18.4.634","article-title":"Troll-tandem repeat occurrence locator","volume":"18","author":"Castelo","year":"2002","journal-title":"Bioinformatics"},{"key":"2023013107261356200_bty747-B11","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1038\/sj.mp.4000353","article-title":"Isolation of a novel potassium channel gene hskca3 containing a polymorphic cag repeat: a candidate for schizophrenia and bipolar disorder?","volume":"3","author":"Chandy","year":"1998","journal-title":"Mol. Psychiatry"},{"key":"2023013107261356200_bty747-B12","doi-asserted-by":"crossref","first-page":"3173","DOI":"10.1093\/hmg\/ddg339","article-title":"Noradrenergic neuronal development is impaired by mutation of the proneural hash-1 gene in congenital central hypoventilation syndrome (ondine\u2019s curse)","volume":"12","author":"de Pontual","year":"2003","journal-title":"Hum. Mol. Genet"},{"key":"2023013107261356200_bty747-B13","doi-asserted-by":"crossref","first-page":"245","DOI":"10.1016\/j.neuron.2011.09.011","article-title":"Expanded GGGGCC hexanucleotide repeat in noncoding region of C9ORF72 causes chromosome 9p-linked FTD and ALS","volume":"72","author":"DeJesus-Hernandez","year":"2011","journal-title":"Neuron"},{"key":"2023013107261356200_bty747-B14","doi-asserted-by":"crossref","first-page":"2812","DOI":"10.1093\/bioinformatics\/bth335","article-title":"Star: an algorithm to search for tandem approximate repeats","volume":"20","author":"Delgrange","year":"2004","journal-title":"Bioinformatics"},{"key":"2023013107261356200_bty747-B15","doi-asserted-by":"crossref","first-page":"445","DOI":"10.1146\/annurev-genet-072610-155046","article-title":"Variable tandem repeats accelerate evolution of coding and regulatory sequences","volume":"44","author":"Gemayel","year":"2010","journal-title":"Annu. Rev. Genet"},{"key":"2023013107261356200_bty747-B16","doi-asserted-by":"crossref","first-page":"e22.","DOI":"10.1093\/nar\/gks881","article-title":"Msdetector: toward a standard computational tool for DNA microsatellites detection","volume":"41","author":"Girgis","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"2023013107261356200_bty747-B17","doi-asserted-by":"crossref","first-page":"309","DOI":"10.1002\/emmm.201100135","article-title":"A cag repeat polymorphism of kcnn3 predicts sk3 channel function and cognitive performance in schizophrenia","volume":"3","author":"Grube","year":"2011","journal-title":"EMBO Mol. Med"},{"key":"2023013107261356200_bty747-B18","doi-asserted-by":"crossref","first-page":"1154","DOI":"10.1101\/gr.135780.111","article-title":"lobstr: a short tandem repeat profiler for personal genomes","volume":"22","author":"Gymrek","year":"2012","journal-title":"Genome Res"},{"key":"2023013107261356200_bty747-B19","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1126\/science.1229566","article-title":"Identifying personal genomes by surname inference","volume":"339","author":"Gymrek","year":"2013","journal-title":"Science"},{"key":"2023013107261356200_bty747-B20","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1016\/j.gene.2011.10.028","article-title":"Core promoter strs: novel mechanism for inter-individual variation in gene expression in humans","volume":"492","author":"Heidari","year":"2012","journal-title":"Gene"},{"key":"2023013107261356200_bty747-B21","doi-asserted-by":"crossref","first-page":"338.","DOI":"10.1038\/nbt.4060","article-title":"Nanopore sequencing and assembly of a human genome with ultra-long reads","volume":"36","author":"Jain","year":"2018","journal-title":"Nat. Biotechnol"},{"key":"2023013107261356200_bty747-B22","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1007\/BF02715889","article-title":"Exact tandem repeats analyzer (e-tra): a new program for DNA sequence mining","volume":"84","author":"Karaca","year":"2005","journal-title":"J. Genet"},{"key":"2023013107261356200_bty747-B23","doi-asserted-by":"crossref","first-page":"D493","DOI":"10.1093\/nar\/gkh103","article-title":"The ucsc table browser data retrieval tool","volume":"32","author":"Karolchik","year":"2004","journal-title":"Nucleic Acids Res"},{"key":"2023013107261356200_bty747-B24","doi-asserted-by":"crossref","first-page":"1683","DOI":"10.1093\/bioinformatics\/btm157","article-title":"Sciroko: a new tool for whole genome microsatellite search and investigation","volume":"23","author":"Kofler","year":"2007","journal-title":"Bioinformatics"},{"key":"2023013107261356200_bty747-B25","doi-asserted-by":"crossref","first-page":"3672","DOI":"10.1093\/nar\/gkg617","article-title":"mreps: efficient and flexible detection of tandem repeats in DNA","volume":"31","author":"Kolpakov","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2023013107261356200_bty747-B26","doi-asserted-by":"crossref","first-page":"2702","DOI":"10.1093\/bioinformatics\/bth311","article-title":"Exhaustive whole-genome tandem repeats search","volume":"20","author":"Krishnan","year":"2004","journal-title":"Bioinformatics"},{"key":"2023013107261356200_bty747-B27","doi-asserted-by":"crossref","first-page":"4633","DOI":"10.1093\/nar\/29.22.4633","article-title":"Reputer: the manifold applications of repeat analysis on a genomic scale","volume":"29","author":"Kurtz","year":"2001","journal-title":"Nucleic Acids Res"},{"key":"2023013107261356200_bty747-B28","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1093\/bib\/bbs023","article-title":"Review of tandem repeat search tools: a systematic approach to evaluating algorithmic performance","volume":"14","author":"Lim","year":"2013","journal-title":"Brief. Bioinform"},{"key":"2023013107261356200_bty747-B29","author":"Mador-House","year":"2014"},{"key":"2023013107261356200_bty747-B30","doi-asserted-by":"crossref","first-page":"932","DOI":"10.1038\/nature05977","article-title":"Expandable DNA repeats and human disease","volume":"447","author":"Mirkin","year":"2007","journal-title":"Nature"},{"key":"2023013107261356200_bty747-B31","doi-asserted-by":"crossref","first-page":"1181","DOI":"10.1093\/bioinformatics\/btm097","article-title":"Imex: imperfect microsatellite extractor","volume":"23","author":"Mudunuri","year":"2007","journal-title":"Bioinformatics"},{"key":"2023013107261356200_bty747-B32","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1016\/0022-2836(70)90057-4","article-title":"A general method applicable to the search for similarities in the amino acid sequence of two proteins","volume":"48","author":"Needleman","year":"1970","journal-title":"J. Mol. Biol"},{"key":"2023013107261356200_bty747-B33","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1016\/j.gene.2012.07.001","article-title":"Evolutionary trend of exceptionally long human core promoter short tandem repeats","volume":"507","author":"Ohadi","year":"2012","journal-title":"Gene"},{"key":"2023013107261356200_bty747-B34","doi-asserted-by":"crossref","first-page":"1733","DOI":"10.1093\/bioinformatics\/btg268","article-title":"String: finding tandem repeats in DNA sequences","volume":"19","author":"Parisi","year":"2003","journal-title":"Bioinformatics"},{"key":"2023013107261356200_bty747-B35","doi-asserted-by":"crossref","first-page":"i358","DOI":"10.1093\/bioinformatics\/btq209","article-title":"Trstalker: an efficient heuristic for finding fuzzy tandem repeats","volume":"26","author":"Pellegrini","year":"2010","journal-title":"Bioinformatics"},{"key":"2023013107261356200_bty747-B36","doi-asserted-by":"crossref","first-page":"S3.","DOI":"10.1186\/1471-2105-13-S4-S3","article-title":"Tandem repeats discovery service (treads) applied to finding novel cis-acting factors in repeat expansion diseases","volume":"13","author":"Pellegrini","year":"2012","journal-title":"BMC Bioinformatics"},{"key":"2023013107261356200_bty747-B37","doi-asserted-by":"crossref","first-page":"612.","DOI":"10.1186\/1471-2164-10-612","article-title":"Sequence determinants of human microsatellite variability","volume":"10","author":"Pemberton","year":"2009","journal-title":"BMC Genomics"},{"key":"2023013107261356200_bty747-B38","doi-asserted-by":"crossref","first-page":"316","DOI":"10.1016\/j.ygeno.2010.08.001","article-title":"Bwtrs: a tool for searching for tandem repeats in DNA sequences based on the burrows\u2013wheeler transform","volume":"96","author":"Pokrzywa","year":"2010","journal-title":"Genomics"},{"key":"2023013107261356200_bty747-B39","first-page":"1","author":"Pop","year":"2015"},{"key":"2023013107261356200_bty747-B40","doi-asserted-by":"crossref","first-page":"e70.","DOI":"10.1371\/journal.pgen.0010070","article-title":"Clines, clusters, and the effect of study design on the inference of human population structure","volume":"1","author":"Rosenberg","year":"2005","journal-title":"PLoS Genet"},{"key":"2023013107261356200_bty747-B41","doi-asserted-by":"crossref","first-page":"320","DOI":"10.1093\/nar\/29.1.320","article-title":"Strbase: a short tandem repeat DNA database for the human identity testing community","volume":"29","author":"Ruitberg","year":"2001","journal-title":"Nucleic Acids Res"},{"key":"2023013107261356200_bty747-B42","doi-asserted-by":"crossref","first-page":"2284","DOI":"10.1093\/nar\/gkn064","article-title":"Empirical comparison of ab initio repeat finding programs","volume":"36","author":"Saha","year":"2008","journal-title":"Nucleic Acids Res"},{"key":"2023013107261356200_bty747-B43","doi-asserted-by":"crossref","first-page":"544","DOI":"10.1016\/j.ajhg.2009.09.019","article-title":"Spinocerebellar ataxia type 31 is associated with \u201cinserted\u201d penta-nucleotide repeats containing (tggaa)n","volume":"85","author":"Sato","year":"2009","journal-title":"Am. J. Hum. Genet"},{"key":"2023013107261356200_bty747-B44","doi-asserted-by":"crossref","first-page":"e54710.","DOI":"10.1371\/journal.pone.0054710","article-title":"Microsatellite tandem repeats are abundant in human promoters and are associated with regulatory elements","volume":"8","author":"Sawaya","year":"2013","journal-title":"PLoS One"},{"key":"2023013107261356200_bty747-B45","author":"Smit","year":"2017"},{"key":"2023013107261356200_bty747-B46","doi-asserted-by":"crossref","first-page":"e30","DOI":"10.1093\/bioinformatics\/btl309","article-title":"Tandem repeats over the edit distance","volume":"23","author":"Sokol","year":"2007","journal-title":"Bioinformatics"},{"key":"2023013107261356200_bty747-B47","doi-asserted-by":"crossref","first-page":"GC1","DOI":"10.1016\/0378-1119(95)00714-8","article-title":"A dot-matrix program with dynamic threshold control suited for genomic DNA and protein sequence analysis","volume":"167","author":"Sonnhammer","year":"1995","journal-title":"Gene"},{"key":"2023013107261356200_bty747-B48","doi-asserted-by":"crossref","first-page":"701","DOI":"10.1093\/bioinformatics\/btg470","article-title":"Adplot: detection and visualization of repetitive patterns in complete genomes","volume":"20","author":"Taneda","year":"2004","journal-title":"Bioinformatics"},{"key":"2023013107261356200_bty747-B49","doi-asserted-by":"crossref","first-page":"411","DOI":"10.1007\/s00122-002-1031-0","article-title":"Exploiting est databases for the development and characterization of gene-derived ssr-markers in barley (hordeum vulgare l.)","volume":"106","author":"Thiel","year":"2003","journal-title":"Theor. Appl. Genet"},{"key":"2023013107261356200_bty747-B50","doi-asserted-by":"crossref","first-page":"440","DOI":"10.1016\/j.neuron.2013.03.026","article-title":"CGG repeat-associated translation mediates neurodegeneration in fragile x tremor ataxia syndrome","volume":"78","author":"Todd","year":"2013","journal-title":"Neuron"},{"key":"2023013107261356200_bty747-B51","doi-asserted-by":"crossref","first-page":"967","DOI":"10.1101\/gr.10.7.967","article-title":"Microsatellites in different eukaryotic genomes: survey and analysis","volume":"10","author":"T\u00f3th","year":"2000","journal-title":"Genome Res"},{"key":"2023013107261356200_bty747-B52","doi-asserted-by":"crossref","first-page":"2587","DOI":"10.1093\/emboj\/20.10.2587","article-title":"Replication slippage involves DNA polymerase pausing and dissociation","volume":"20","author":"Viguera","year":"2001","journal-title":"EMBO J"},{"key":"2023013107261356200_bty747-B53","doi-asserted-by":"crossref","first-page":"1213","DOI":"10.1126\/science.1170097","article-title":"Unstable tandem repeats in promoters confer transcriptional evolvability","volume":"324","author":"Vinces","year":"2009","journal-title":"Science"},{"key":"2023013107261356200_bty747-B54","doi-asserted-by":"crossref","first-page":"928","DOI":"10.1089\/cmb.2005.12.928","article-title":"Finding approximate tandem repeats in genomic sequences","volume":"12","author":"Wexler","year":"2005","journal-title":"J. Comput. Biol"},{"key":"2023013107261356200_bty747-B55","doi-asserted-by":"crossref","first-page":"e49083.","DOI":"10.1371\/journal.pone.0049083","article-title":"A common trinucleotide repeat expansion within the transcription factor 4 (TCF4, E2-2) gene predicts Fuchs corneal dystrophy","volume":"7","author":"Wieben","year":"2012","journal-title":"PLoS One"},{"key":"2023013107261356200_bty747-B56","doi-asserted-by":"crossref","first-page":"590","DOI":"10.1038\/nmeth.4267","article-title":"Genome-wide profiling of heritable and de novo str variations","volume":"14","author":"Willems","year":"2017","journal-title":"Nat. Methods"},{"key":"2023013107261356200_bty747-B57","doi-asserted-by":"crossref","first-page":"221","DOI":"10.1086\/510800","article-title":"Cgg-repeat expansion in the DIP2B gene is associated with the fragile site FRA12A on chromosome 12q13.1","volume":"80","author":"Winnepenninckx","year":"2007","journal-title":"Am. J. Hum. Genet"},{"key":"2023013107261356200_bty747-B58","doi-asserted-by":"crossref","first-page":"151","DOI":"10.1007\/978-3-642-16750-8_14","volume-title":"Computational Systems-Biology and Bioinformatics","author":"Wirawan","year":"2010"},{"key":"2023013107261356200_bty747-B59","doi-asserted-by":"crossref","first-page":"747","DOI":"10.1109\/TITB.2008.920626","article-title":"Detection of tandem repeats in DNA sequences based on parametric spectral estimation","volume":"13","author":"Zhou","year":"2009","journal-title":"IEEE Trans. Inf. Technol. Biomed"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/6\/914\/48966648\/bioinformatics_35_6_914.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/6\/914\/48966648\/bioinformatics_35_6_914.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T05:24:00Z","timestamp":1675142640000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/35\/6\/914\/5085378"}},"subtitle":[],"editor":[{"given":"John","family":"Hancock","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2018,8,28]]},"references-count":59,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2019,3,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bty747","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/240937","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2019,3,15]]},"published":{"date-parts":[[2018,8,28]]}}}