{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T07:38:42Z","timestamp":1740123522980,"version":"3.37.3"},"reference-count":35,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2021,8,25]],"date-time":"2021-08-25T00:00:00Z","timestamp":1629849600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,8,25]],"date-time":"2021-08-25T00:00:00Z","timestamp":1629849600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100004837","name":"Ministerio de Ciencia e Innovaci\u00f3n","doi-asserted-by":"crossref","award":["PID2019-104184RB-I00 \/ AEI \/ 10.13039\/501100011033"],"award-info":[{"award-number":["PID2019-104184RB-I00 \/ AEI \/ 10.13039\/501100011033"]}],"id":[{"id":"10.13039\/501100004837","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100010801","name":"Xunta de Galicia","doi-asserted-by":"publisher","award":["ED431G 2019\/01","ED431C 2021\/30"],"award-info":[{"award-number":["ED431G 2019\/01","ED431C 2021\/30"]}],"id":[{"id":"10.13039\/501100010801","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100014597","name":"Universidade da Coru\u00f1a","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100014597","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Supercomput"],"published-print":{"date-parts":[[2022,2]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Tandem Repeats (TRs) are segments that occur several times in a DNA sequence, and each copy is adjacent to other. In the last few years, TRs have gained significant attention as they are thought to be related with certain human diseases. Therefore, identifying and classifying TRs have become a highly important task in bioinformatics in order to analyze their disorders and relationships with illnesses. <jats:italic>Dot2dot<\/jats:italic>, a tool recently developed to find TRs, provides more accurate results than the previous state-of-the-art, but it requires a long execution time even when using multiple threads. This work presents <jats:italic>MPI-dot2dot<\/jats:italic>, a novel version of this tool that combines MPI and OpenMP so that it can be executed in a cluster of multicore nodes and thus reduces its execution time. The performance of this new parallel implementation has been tested using different real datasets. Depending on the characteristics of the input genomes, it is able to obtain the same biological results as <jats:italic>Dot2dot<\/jats:italic> but more than 100 times faster on a 16-node multicore cluster (384 cores). <jats:italic>MPI-dot2dot<\/jats:italic> is publicly available to download from <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/sourceforge.net\/projects\/mpi-dot2dot\">https:\/\/sourceforge.net\/projects\/mpi-dot2dot<\/jats:ext-link>.<\/jats:p>","DOI":"10.1007\/s11227-021-04025-7","type":"journal-article","created":{"date-parts":[[2021,8,25]],"date-time":"2021-08-25T15:02:43Z","timestamp":1629903763000},"page":"4217-4235","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["MPI-dot2dot: A parallel tool to find DNA tandem repeats on multicore clusters"],"prefix":"10.1007","volume":"78","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2602-4874","authenticated-orcid":false,"given":"Jorge","family":"Gonz\u00e1lez-Dom\u00ednguez","sequence":"first","affiliation":[]},{"given":"Jos\u00e9 M.","family":"Mart\u00edn-Mart\u00ednez","sequence":"additional","affiliation":[]},{"given":"Roberto R.","family":"Exp\u00f3sito","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2021,8,25]]},"reference":[{"key":"4025_CR1","unstructured":"Message Passing Interface Forum. MPI: A Message-Passing Interface Standard Version 3.1 (2015). [Online] Available: http:\/\/www.mpi-forum.org\/docs\/mpi-3.1\/mpi31-report.pdf"},{"issue":"6","key":"4025_CR2","doi-asserted-by":"publisher","first-page":"943","DOI":"10.1093\/bioinformatics\/btx721","volume":"34","author":"AK Avvaru","year":"2018","unstructured":"Avvaru AK, Sowpati DT, Mishra RK (2018) PERF: an exhaustive algorithm for ultra-fast and efficient identification of microsatellites from large DNA sequences. Bioinformatics 34(6):943\u2013948","journal-title":"Bioinformatics"},{"issue":"D1","key":"4025_CR3","doi-asserted-by":"publisher","first-page":"D36","DOI":"10.1093\/nar\/gks1195","volume":"41","author":"DA Benson","year":"2012","unstructured":"Benson DA, Cavanaugh M, Clark K, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW (2012) GenBank. Nucleic Acids Research 41(D1):D36\u2013D42","journal-title":"Nucleic Acids Research"},{"issue":"2","key":"4025_CR4","doi-asserted-by":"publisher","first-page":"573","DOI":"10.1093\/nar\/27.2.573","volume":"27","author":"G Benson","year":"1999","unstructured":"Benson G (1999) Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Research 27(2):573\u2013580","journal-title":"Nucleic Acids Research"},{"issue":"6","key":"4025_CR5","doi-asserted-by":"publisher","first-page":"676","DOI":"10.1093\/bioinformatics\/btk032","volume":"22","author":"V Boeva","year":"2006","unstructured":"Boeva V, Regnier M, Papatsenko D, Makeev V (2006) Short fuzzy tandem repeats in genomic sequences, identification, and possible role in regulation of gene expression. Bioinformatics 22(6):676\u2013684","journal-title":"Bioinformatics"},{"issue":"4","key":"4025_CR6","doi-asserted-by":"publisher","first-page":"634","DOI":"10.1093\/bioinformatics\/18.4.634","volume":"18","author":"AT Castelo","year":"2002","unstructured":"Castelo AT, Martins W, Gao GR (2002) TROLL-tandem repeat occurrence locator. Bioinformatics 18(4):634\u2013636","journal-title":"Bioinformatics"},{"issue":"1","key":"4025_CR7","doi-asserted-by":"publisher","first-page":"46","DOI":"10.1109\/99.660313","volume":"5","author":"L Dagum","year":"1998","unstructured":"Dagum L, Menon R (1998) OpenMP: an industry standard API for shared-memory programming. Comput Sci Eng IEEE 5(1):46\u201355","journal-title":"Comput Sci Eng IEEE"},{"issue":"1","key":"4025_CR8","doi-asserted-by":"publisher","first-page":"239","DOI":"10.1186\/s13059-019-1856-3","volume":"20","author":"A De Roeck","year":"2019","unstructured":"De Roeck A, De Coster W, Bossaerts L, Cacace R, De Pooter T, Van Dongen J, D\u2019Hert S, De Rijk P, Strazisar M, Van Broeckhoven C et al (2019) NanoSatellite: accurate characterization of expanded tandem repeat length and sequence through whole genome long-read sequencing on PromethION. Genome Biol 20(1):239","journal-title":"Genome Biol"},{"issue":"16","key":"4025_CR9","doi-asserted-by":"publisher","first-page":"2812","DOI":"10.1093\/bioinformatics\/bth335","volume":"20","author":"O Delgrange","year":"2004","unstructured":"Delgrange O, Rivals E (2004) STAR: an algorithm to search for tandem approximate repeats. Bioinformatics 20(16):2812\u20132820","journal-title":"Bioinformatics"},{"issue":"7583","key":"4025_CR10","doi-asserted-by":"publisher","first-page":"585","DOI":"10.1038\/nature16191","volume":"528","author":"L Doyle","year":"2015","unstructured":"Doyle L, Hallinan J, Bolduc J, Parmeggiani F, Baker D, Stoddard BL, Bradley P (2015) Rational design of $$\\alpha $$-helical tandem repeat proteins with closed architectures. Nature 528(7583):585\u2013588","journal-title":"Nature"},{"key":"4025_CR11","unstructured":"Galician Supercomputing Center: CESGA. [Online] Available: https:\/\/www.cesga.es. Last visited: August 2021"},{"issue":"6","key":"4025_CR12","doi-asserted-by":"publisher","first-page":"914","DOI":"10.1093\/bioinformatics\/bty747","volume":"35","author":"LM Genovese","year":"2019","unstructured":"Genovese LM, Mosca MM, Pellegrini M, Geraci F (2019) Dot2dot: accurate whole-genome tandem repeats discovery. Bioinformatics 35(6):914\u2013922","journal-title":"Bioinformatics"},{"issue":"1","key":"4025_CR13","doi-asserted-by":"publisher","first-page":"e22","DOI":"10.1093\/nar\/gks881","volume":"41","author":"HZ Girgis","year":"2013","unstructured":"Girgis HZ, Sheetlin SL (2013) MsDetector: toward a standard computational tool for DNA microsatellites detection. Nucleic Acids Research 41(1):e22\u2013e22","journal-title":"Nucleic Acids Research"},{"issue":"2","key":"4025_CR14","doi-asserted-by":"publisher","first-page":"216","DOI":"10.2174\/1574893612666170529120424","volume":"13","author":"S Gupta","year":"2018","unstructured":"Gupta S, Prasad R (2018) Searching exact tandem repeats in DNA sequences using enhanced suffix array. Curr Bioinformat 13(2):216\u2013222","journal-title":"Curr Bioinformat"},{"issue":"5","key":"4025_CR15","doi-asserted-by":"publisher","first-page":"286","DOI":"10.1038\/nrg.2017.115","volume":"19","author":"AJ Hannan","year":"2018","unstructured":"Hannan AJ (2018) Tandem repeats mediating genetic plasticity in health and disease. Nat Rev Genet 19(5):286","journal-title":"Nat Rev Genet"},{"issue":"22","key":"4025_CR16","doi-asserted-by":"publisher","first-page":"4809","DOI":"10.1093\/bioinformatics\/btz484","volume":"35","author":"RS Harris","year":"2019","unstructured":"Harris RS, Cechova M, Makova KD (2019) Noise-cancelling repeat finder: uncovering tandem repeats in error-prone long-read sequencing data. Bioinformatics 35(22):4809\u20134811","journal-title":"Bioinformatics"},{"key":"4025_CR17","doi-asserted-by":"crossref","unstructured":"Kinkar L, Korhonen PK, Cai H, Gauci CG, Lightowlers MW, Saarma U, Jenkins DJ, Li J, Li J, Young ND et\u00a0al (2019) Long-Read Sequencing Reveals a 4.4 kb Tandem Repeat Region in the Mitogenome of Echinococcus Granulosus (sensu stricto) Genotype G1. Parasites & Vectors 12(1), 1\u20137","DOI":"10.1186\/s13071-019-3492-x"},{"issue":"13","key":"4025_CR18","doi-asserted-by":"publisher","first-page":"3672","DOI":"10.1093\/nar\/gkg617","volume":"31","author":"R Kolpakov","year":"2003","unstructured":"Kolpakov R, Bana G, Kucherov G (2003) mreps: efficient and flexible detection of tandem repeats in DNA. Nucleic Acids Research 31(13):3672\u20133678","journal-title":"Nucleic Acids Research"},{"issue":"6330","key":"4025_CR19","doi-asserted-by":"publisher","first-page":"77","DOI":"10.1038\/352077a0","volume":"352","author":"AR La Spada","year":"1991","unstructured":"La Spada AR, Wilson EM, Lubahn DB, Harding A, Fischbeck KH (1991) Androgen receptor gene mutations in X-linked spinal and bulbar muscular atrophy. Nature 352(6330):77\u201379","journal-title":"Nature"},{"issue":"13","key":"4025_CR20","doi-asserted-by":"publisher","first-page":"4685","DOI":"10.3390\/ijms21134685","volume":"21","author":"Z Li","year":"2020","unstructured":"Li Z, Li M, Xu S, Liu L, Chen Z, Zou K (2020) Complete mitogenomes of three carangidae (perciformes) fishes: genome description and phylogenetic considerations. Int J Mol Sci 21(13):4685","journal-title":"Int J Mol Sci"},{"issue":"1","key":"4025_CR21","doi-asserted-by":"publisher","first-page":"67","DOI":"10.1093\/bib\/bbs023","volume":"14","author":"KG Lim","year":"2013","unstructured":"Lim KG, Kwoh CK, Hsu LY, Wirawan A (2013) Review of tandem repeat search tools: a systematic approach to evaluating algorithmic performance. Brief Bioinformat 14(1):67\u201381","journal-title":"Brief Bioinformat"},{"key":"4025_CR22","doi-asserted-by":"crossref","unstructured":"Mart\u00ednek T, Lexa M (2010) Hardware acceleration of approximate tandem repeat detection. In: proceedings of the 2010 18th IEEE annual international symposium on field-programmable custom computing machines (FCCM \u201910), pp. 79\u201386","DOI":"10.1109\/FCCM.2010.21"},{"issue":"11","key":"4025_CR23","doi-asserted-by":"publisher","first-page":"a036798","DOI":"10.1101\/cshperspect.a036798","volume":"9","author":"WR McCombie","year":"2019","unstructured":"McCombie WR, McPherson JD, Mardis ER (2019) Next-generation sequencing technologies. Cold Spring Harbor Perspect Med 9(11):a036798","journal-title":"Cold Spring Harbor Perspect Med"},{"issue":"5","key":"4025_CR24","doi-asserted-by":"publisher","first-page":"355","DOI":"10.1093\/bib\/bbn028","volume":"9","author":"A Merkel","year":"2008","unstructured":"Merkel A, Gemmell N (2008) Detecting short tandem repeats from genome data: opening the software black box. Brief Bioinformat 9(5):355\u2013366","journal-title":"Brief Bioinformat"},{"key":"4025_CR25","unstructured":"Nichols B, Buttlar D, Farrell JP (1996)  Pthreads Programming: A POSIX Standard for Better Multiprocessing, vol. 19"},{"issue":"12","key":"4025_CR26","doi-asserted-by":"publisher","first-page":"e111","DOI":"10.1093\/nar\/gkx257","volume":"45","author":"P Nov\u00e1k","year":"2017","unstructured":"Nov\u00e1k P, \u00c1vila Robledillo L, Kobl\u00ed\u017ekov\u00e1 A, Vrbov\u00e1 I, Neumann P, Macas J (2017) TAREAN: a computational tool for identification and characterization of satellite DNA from unassembled short reads. Nucleic Acids Research 45(12):e111\u2013e111","journal-title":"Nucleic Acids Research"},{"key":"4025_CR27","doi-asserted-by":"crossref","unstructured":"Olson D, Wheeler T (2018) ULTRA: a model based tool to detect tandem repeats. In: proceedings of the 2018 ACM international conference on bioinformatics, computational biology, and health informatics (BCB \u201918), pp. 37\u201346","DOI":"10.1145\/3233547.3233604"},{"issue":"12","key":"4025_CR28","doi-asserted-by":"publisher","first-page":"i358","DOI":"10.1093\/bioinformatics\/btq209","volume":"26","author":"M Pellegrini","year":"2010","unstructured":"Pellegrini M, Renda ME, Vecchio A (2010) TRStalker: an efficient heuristic for finding fuzzy tandem repeats. Bioinformatics 26(12):i358\u2013i366","journal-title":"Bioinformatics"},{"issue":"5","key":"4025_CR29","doi-asserted-by":"publisher","first-page":"316","DOI":"10.1016\/j.ygeno.2010.08.001","volume":"96","author":"R Pokrzywa","year":"2010","unstructured":"Pokrzywa R, Polanski A (2010) BWtrs: a tool for searching for tandem repeats in DNA sequences based on the burrows-wheeler transform. Genomics 96(5):316\u2013321","journal-title":"Genomics"},{"key":"4025_CR30","doi-asserted-by":"crossref","unstructured":"Samsi S, Helfer B, Kepner J, Reuther A, Ricke DO (2017) A linear algebra approach to fast DNA mixture analysis using GPUs. In: proceedings of the 2017 IEEE high performance extreme computing conference (HPEC \u201917), pp. 1\u20136","DOI":"10.1109\/HPEC.2017.8091027"},{"key":"4025_CR31","doi-asserted-by":"crossref","unstructured":"Savari, H., Hadiniya, N., Savadi, A., Naghibzadeh, M.: Microsatellite Finder Algorithm with High Memory Efficiency for Even Super Long Sequences. In: Proceedings of the 2020 10th International Conference on Computer and Knowledge Engineering (ICCKE), pp. 1\u20135 (2020)","DOI":"10.1109\/ICCKE50421.2020.9303640"},{"issue":"3","key":"4025_CR32","doi-asserted-by":"publisher","first-page":"421","DOI":"10.1016\/j.ajhg.2018.07.011","volume":"103","author":"JH Song","year":"2018","unstructured":"Song JH, Lowe CB, Kingsley DM (2018) Characterization of a human-specific tandem repeat associated with bipolar disorder and schizophrenia. Am J Human Gen 103(3):421\u2013430","journal-title":"Am J Human Gen"},{"issue":"7827","key":"4025_CR33","doi-asserted-by":"publisher","first-page":"80","DOI":"10.1038\/s41586-020-2579-z","volume":"586","author":"B Trost","year":"2020","unstructured":"Trost B, Engchuan W, Nguyen CM, Thiruvahindrapuram B, Dolzhenko E, Backstrom I, Mirceta M, Mojarad BA, Yin Y, Dov A et al (2020) Genome-wide detection of tandem DNA repeats that are expanded in Autism. Nature 586(7827):80\u201386","journal-title":"Nature"},{"issue":"7","key":"4025_CR34","doi-asserted-by":"publisher","first-page":"1011","DOI":"10.1101\/gr.070409.107","volume":"18","author":"K Usdin","year":"2008","unstructured":"Usdin K (2008) The biological effects of simple tandem repeats: lessons from the repeat expansion diseases. Genome Research 18(7):1011\u20131019","journal-title":"Genome Research"},{"key":"4025_CR35","doi-asserted-by":"crossref","unstructured":"Voet AR, Simoncini D, Tame JR, Zhang KY (2017) Evolution-inspired computational design of symmetric proteins. In: Computational Protein Design, pp. 309\u2013322. Springer","DOI":"10.1007\/978-1-4939-6637-0_16"}],"container-title":["The Journal of Supercomputing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11227-021-04025-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11227-021-04025-7\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11227-021-04025-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,2,7]],"date-time":"2022-02-07T13:26:51Z","timestamp":1644240411000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11227-021-04025-7"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,8,25]]},"references-count":35,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2022,2]]}},"alternative-id":["4025"],"URL":"https:\/\/doi.org\/10.1007\/s11227-021-04025-7","relation":{},"ISSN":["0920-8542","1573-0484"],"issn-type":[{"type":"print","value":"0920-8542"},{"type":"electronic","value":"1573-0484"}],"subject":[],"published":{"date-parts":[[2021,8,25]]},"assertion":[{"value":"13 August 2021","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 August 2021","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}