{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T14:40:39Z","timestamp":1775054439978,"version":"3.50.1"},"reference-count":18,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2019,7,16]],"date-time":"2019-07-16T00:00:00Z","timestamp":1563235200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001459","name":"Singapore Ministry of Education","doi-asserted-by":"crossref","award":["MOE2015-T2-2-086"],"award-info":[{"award-number":["MOE2015-T2-2-086"]}],"id":[{"id":"10.13039\/501100001459","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Algorithms"],"published-print":{"date-parts":[[2019,7,31]]},"abstract":"<jats:p>Tandem duplication in DNA is the process of inserting a copy of a segment of DNA adjacent to the original position. Motivated by applications that store data in living organisms, Jain et al. (2016) proposed the study of codes that correct tandem duplications to improve the reliability of data storage. We investigate algorithms associated with the study of these codes.<\/jats:p>\n          <jats:p>\n            Two words are said to be \u2a7d-confusable if there exists a sequence of tandem duplications for each word, where each duplication is of length at most\n            <jats:italic>k<\/jats:italic>\n            , such that the resulting two words after duplications are equal. For\n            <jats:italic>k<\/jats:italic>\n            =3, we demonstrate that the problem of deciding whether two words is \u2a7d3-confusable is linear-time solvable through a characterisation that can be checked efficiently. Combining with previous results, the decision problem is linear-time solvable for\n            <jats:italic>k<\/jats:italic>\n            \u2a7d 3. We conjecture that this problem is undecidable for\n            <jats:italic>k<\/jats:italic>\n            &gt; 3.\n          <\/jats:p>\n          <jats:p>Using insights gained from the algorithm, we study the size of tandem-duplication codes. We improve the previous known upper bound and then construct codes with larger sizes as compared to the previous constructions. We determine the sizes of optimal tandem-duplication codes for lengths up to 20, develop recursive methods to construct tandem-duplication codes for all word lengths, and compute explicit lower bounds for the size of optimal tandem-duplication codes for lengths from 21 to 30.<\/jats:p>","DOI":"10.1145\/3338514","type":"journal-article","created":{"date-parts":[[2019,7,16]],"date-time":"2019-07-16T12:39:01Z","timestamp":1563280741000},"page":"1-22","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":11,"title":["Deciding the Confusability of Words under Tandem Repeats in Linear Time"],"prefix":"10.1145","volume":"15","author":[{"given":"Yeow Meng","family":"Chee","sequence":"first","affiliation":[{"name":"National University of Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Johan","family":"Chrisnata","sequence":"additional","affiliation":[{"name":"Nanyang Technological University, Nanyang Link, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5611-0848","authenticated-orcid":false,"given":"Han Mao","family":"Kiah","sequence":"additional","affiliation":[{"name":"Nanyang Technological University, Nanyang Link, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tuan Thanh","family":"Nguyen","sequence":"additional","affiliation":[{"name":"Nanyang Technological University, Nanyang Link, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2019,7,16]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1021\/bp049917i"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.2015.2505735"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.0408118101"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1186\/1471-2105-8-176"},{"key":"e_1_2_1_5_1","volume-title":"Combinatorics of Compositions and Words","author":"Heubach Silvia","unstructured":"Silvia Heubach and Toufik Mansour . 2009. Combinatorics of Compositions and Words . CRC Press . Silvia Heubach and Toufik Mansour. 2009. Combinatorics of Compositions and Words. CRC Press."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.2017.2728079"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.2017.2688361"},{"key":"e_1_2_1_8_1","volume-title":"An improved branch and bound algorithm for the maximum clique problem. Proteins 4, 5","author":"Konc Janez","year":"2007","unstructured":"Janez Konc and Du\u0161anka Janezic . 2007. An improved branch and bound algorithm for the maximum clique problem. Proteins 4, 5 ( 2007 ). Retrieved from http:\/\/insilab.org\/maxclique\/. Janez Konc and Du\u0161anka Janezic. 2007. An improved branch and bound algorithm for the maximum clique problem. Proteins 4, 5 (2007). Retrieved from http:\/\/insilab.org\/maxclique\/."},{"key":"e_1_2_1_9_1","volume-title":"William FitzHugh et al","author":"Lander Eric S.","year":"2001","unstructured":"Eric S. Lander , Lauren M. Linton , Bruce Birren , Chad Nusbaum , Michael C. Zody , Jennifer Baldwin , Keri Devon , Ken Dewar , Michael Doyle , William FitzHugh et al . 2001 . Initial sequencing and analysis of the human genome. Nature 409, 6822 (2001), 860--921. Eric S. Lander, Lauren M. Linton, Bruce Birren, Chad Nusbaum, Michael C. Zody, Jennifer Baldwin, Keri Devon, Ken Dewar, Michael Doyle, William FitzHugh et al. 2001. Initial sequencing and analysis of the human genome. Nature 409, 6822 (2001), 860--921."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.5555\/1704840.1705009"},{"key":"e_1_2_1_11_1","volume-title":"Sempere","author":"Leupold Peter","year":"2003","unstructured":"Peter Leupold , Victor Mitrana , and Jos\u00e9 M . Sempere . 2003 . Formal languages arising from gene repeated duplication. In Aspects of Molecular Computing. Springer , 297--308. Peter Leupold, Victor Mitrana, and Jos\u00e9 M. Sempere. 2003. Formal languages arising from gene repeated duplication. In Aspects of Molecular Computing. Springer, 297--308."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0042465"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00239-004-2619-6"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/375360.375365"},{"key":"e_1_2_1_15_1","volume-title":"Church","author":"Shipman Seth L.","year":"2017","unstructured":"Seth L. Shipman , Jeff Nivala , Jeffrey D. Macklis , and George M . Church . 2017 . CRISPR--Cas encoding of a digital movie into the genomes of a population of living bacteria. Nature 547, 7663 (2017), 345. Seth L. Shipman, Jeff Nivala, Jeffrey D. Macklis, and George M. Church. 2017. CRISPR--Cas encoding of a digital movie into the genomes of a population of living bacteria. Nature 547, 7663 (2017), 345."},{"key":"e_1_2_1_16_1","volume-title":"Introduction to the Theory of Computation","author":"Sipser Michael","unstructured":"Michael Sipser . 2006. Introduction to the Theory of Computation , vol. 2 . Thomson Course Technology , Boston . Michael Sipser. 2006. Introduction to the Theory of Computation, vol. 2. Thomson Course Technology, Boston."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.92.9.3636"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1101\/gr.070409.107"}],"container-title":["ACM Transactions on Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3338514","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3338514","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T23:44:46Z","timestamp":1750203886000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3338514"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,7,16]]},"references-count":18,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2019,7,31]]}},"alternative-id":["10.1145\/3338514"],"URL":"https:\/\/doi.org\/10.1145\/3338514","relation":{},"ISSN":["1549-6325","1549-6333"],"issn-type":[{"value":"1549-6325","type":"print"},{"value":"1549-6333","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,7,16]]},"assertion":[{"value":"2018-04-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-05-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-07-16","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}