{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,11]],"date-time":"2026-03-11T19:59:10Z","timestamp":1773259150278,"version":"3.50.1"},"reference-count":19,"publisher":"Oxford University Press (OUP)","issue":"16","license":[{"start":{"date-parts":[[2021,2,24]],"date-time":"2021-02-24T00:00:00Z","timestamp":1614124800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,8,25]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>We discover that maximality of information content among intervals of Tandem Repeats (TRs) in animal genome segregates over taxa such that taxa identification becomes swift and accurate. Successive TRs of a motif occur at intervals over the sequence, forming a trail of TRs of the motif across the genome. We present a method, Tandem Repeat Information Mining (TRIM), that mines 4<jats:sup>k<\/jats:sup> number of TR trails of all k length motifs from a whole genome sequence and extracts the information content within intervals of the trails. TRIM vector formed from the ordered set of interval entropies becomes instrumental for genome segregation.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>Reconstruction of correct phylogeny for animals from whole genome sequences proves precision of TRIM. Identification of animal taxa by TRIM vector upon feature selection is the most significant achievement. 
These suggest Tandem Repeat Interval Pattern (TRIP) is a taxa-specific constitutional characteristic in animal genome.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>Source and executable code of TRIM along with usage manual are made available at https:\/\/github.com\/BB-BiG\/TRIM.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab124","type":"journal-article","created":{"date-parts":[[2021,2,22]],"date-time":"2021-02-22T20:11:51Z","timestamp":1614024711000},"page":"2250-2258","source":"Crossref","is-referenced-by-count":2,"title":["Tandem repeat interval pattern identifies animal taxa"],"prefix":"10.1093","volume":"37","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7822-6470","authenticated-orcid":false,"given":"Balaram","family":"Bhattacharyya","sequence":"first","affiliation":[{"name":"Department of Computer and System Sciences, Visva-Bharati University , Santiniketan 731235, West Bengal, India"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5682-5218","authenticated-orcid":false,"given":"Uddalak","family":"Mitra","sequence":"additional","affiliation":[{"name":"Department of Computer and System Sciences, Visva-Bharati University , Santiniketan 731235, West Bengal, India"}]},{"given":"Ramkishore","family":"Bhattacharyya","sequence":"additional","affiliation":[{"name":"Saas, Oracle America Inc. 
, Bellevue, WA 98004, USA"}]}],"member":"286","published-online":{"date-parts":[[2021,2,24]]},"reference":[{"key":"2023051609131672000_btab124-B2","doi-asserted-by":"crossref","first-page":"943","DOI":"10.1093\/bioinformatics\/btx721","article-title":"PERF: an exhaustive algorithm for ultra-fast and efficient identification of microsatellites from large DNA sequences","volume":"34","author":"Avvaru","year":"2018","journal-title":"Bioinformatics"},{"key":"2023051609131672000_btab124-B3","doi-asserted-by":"crossref","first-page":"469","DOI":"10.1089\/cmb.2008.0013","article-title":"Spectrum-based de novo repeat detection in genomic sequences","volume":"15","author":"Do","year":"2008","journal-title":"J. Comput. Biol"},{"key":"2023051609131672000_btab124-B4","doi-asserted-by":"crossref","first-page":"873","DOI":"10.1089\/cmb.2007.0018","article-title":"A novel approach to the detection of genomic approximate tandem repeats in the Levenshtein metric","volume":"14","author":"Domanic","year":"2007","journal-title":"J. Comput. Biol"},{"key":"2023051609131672000_btab124-B5","doi-asserted-by":"crossref","first-page":"3383","DOI":"10.1093\/nar\/gkm271","article-title":"Repeat-induced epigenetic changes in intron 1 of the frataxin gene and its consequences in Friedreich ataxia","volume":"35","author":"Greene","year":"2007","journal-title":"Nucleic Acids Res"},{"key":"2023051609131672000_btab124-B6","doi-asserted-by":"crossref","first-page":"172089","DOI":"10.1098\/rsos.172089","article-title":"Forensic efficiency estimate and phylogenetic analysis for Chinese Kyrgyz ethnic group revealed by a panel of 21 short tandem repeats","volume":"5","author":"Guo","year":"2018","journal-title":"R. Soc. 
Open Sci"},{"key":"2023051609131672000_btab124-B7","doi-asserted-by":"crossref","first-page":"581","DOI":"10.1038\/s41588-018-0067-2","article-title":"Expansions of intronic TTTCA and TTTTA repeats in benign adult familial myoclonic epilepsy","volume":"50","author":"Ishiura","year":"2018","journal-title":"Nat. Genet"},{"key":"2023051609131672000_btab124-B8","doi-asserted-by":"crossref","first-page":"74","DOI":"10.1016\/S0168-9525(97)01008-1","article-title":"Simple sequence repeats as a source of quantitative genetic variation","volume":"13","author":"Kashi","year":"1997","journal-title":"Trends Genet"},{"key":"2023051609131672000_btab124-B9","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1214\/aoms\/1177729694","article-title":"On information and sufficiency","volume":"22","author":"Kullback","year":"1951","journal-title":"Ann. Math. Stat"},{"key":"2023051609131672000_btab124-B10","doi-asserted-by":"crossref","first-page":"1844","DOI":"10.1038\/ncomms2872","article-title":"GATA simple sequence repeats function as enhancer blocker boundaries","volume":"4","author":"Kumar","year":"2013","journal-title":"Nat. Commun"},{"key":"2023051609131672000_btab124-B11","doi-asserted-by":"crossref","first-page":"733","DOI":"10.1007\/s00414-017-1740-1","article-title":"Genetic polymorphisms in 18 autosomal STR loci in the Tibetan population living in Tibet Chamdo, Southwest China","volume":"132","author":"Li","year":"2018","journal-title":"Int. J. Legal Med"},{"key":"2023051609131672000_btab124-B12","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1093\/bib\/bbs023","article-title":"Review of tandem repeat search tools: a systematic approach to evaluating algorithmic performance","volume":"14","author":"Lim","year":"2013","journal-title":"Brief. 
Bioinform"},{"key":"2023051609131672000_btab124-B13","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1109\/18.61115","article-title":"Divergence measures based on the Shannon entropy","volume":"37","author":"Lin","year":"1991","journal-title":"IEEE Trans. Inf. Theory"},{"key":"2023051609131672000_btab124-B14","doi-asserted-by":"crossref","first-page":"275","DOI":"10.1038\/nrg3908","article-title":"Genetic linkage analysis in the age of whole-genome sequencing","volume":"16","author":"Ott","year":"2015","journal-title":"Nat. Rev. Genet"},{"key":"2023051609131672000_btab124-B15","doi-asserted-by":"crossref","first-page":"10274","DOI":"10.1038\/s41598-019-46773-w","article-title":"Ultra-fast genome comparison for large-scale genomic experiments","volume":"9","author":"Perez-Wohlfeil","year":"2019","journal-title":"Sci. Rep"},{"key":"2023051609131672000_btab124-B16","doi-asserted-by":"crossref","first-page":"2707","DOI":"10.1093\/bioinformatics\/btw298","article-title":"SA-SSR: a suffix array-based algorithm for exhaustive and efficient SSR discovery in large genetic sequences","volume":"32","author":"Pickett","year":"2016","journal-title":"Bioinformatics"},{"key":"2023051609131672000_btab124-B17","doi-asserted-by":"crossref","first-page":"3922","DOI":"10.1093\/bioinformatics\/btx538","article-title":"Kmer-SSR: a fast and exhaustive SSR search algorithm","volume":"33","author":"Pickett","year":"2017","journal-title":"Bioinformatics"},{"key":"2023051609131672000_btab124-B18","doi-asserted-by":"crossref","first-page":"174","DOI":"10.1186\/1471-2105-13-174","article-title":"A novel hierarchical clustering algorithm for gene sequences","volume":"13","author":"Wei","year":"2012","journal-title":"BMC Bioinformatics"},{"key":"2023051609131672000_btab124-B19","volume-title":"Computational Systems-Biology and Bioinformatics. CSBio 2010. 
Communications in Computer and Information Science","author":"Wirawan","year":"2010"},{"key":"2023051609131672000_btab124-B20","doi-asserted-by":"crossref","first-page":"176","DOI":"10.1006\/geno.1994.1151","article-title":"Genome fingerprinting by simple sequence repeat (Ssr)-anchored polymerase chain-reaction amplification","volume":"20","author":"Zietkiewicz","year":"1994","journal-title":"Genomics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btab124\/36587578\/btab124.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/16\/2250\/50339414\/btab124.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/16\/2250\/50339414\/btab124.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,16]],"date-time":"2023-05-16T09:16:15Z","timestamp":1684228575000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/16\/2250\/6149036"}},"subtitle":[],"editor":[{"given":"Janet","family":"Kelso","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,2,24]]},"references-count":19,"journal-issue":{"issue":"16","published-print":{"date-parts":[[2021,8,25]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab124","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,8,15]]},"published":{"date-parts":[[2021,2,24]]}}}