{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,21]],"date-time":"2026-01-21T12:14:17Z","timestamp":1768997657169,"version":"3.49.0"},"reference-count":23,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2024,7,20]],"date-time":"2024-07-20T00:00:00Z","timestamp":1721433600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,7,20]],"date-time":"2024-07-20T00:00:00Z","timestamp":1721433600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61628207"],"award-info":[{"award-number":["61628207"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Acta Informatica"],"published-print":{"date-parts":[[2024,9]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivated by computing duplication patterns in sequences, a new problem called the longest letter-duplicated subsequence (LLDS) is proposed. Given a sequence <jats:italic>S<\/jats:italic> of length <jats:italic>n<\/jats:italic>, a letter-duplicated subsequence is a subsequence of <jats:italic>S<\/jats:italic> in the form of <jats:inline-formula><jats:alternatives><jats:tex-math>$$x_1^{d_1}x_2^{d_2}\\ldots x_k^{d_k}$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:msubsup>\n                      <mml:mi>x<\/mml:mi>\n                      <mml:mn>1<\/mml:mn>\n                      <mml:msub>\n                        <mml:mi>d<\/mml:mi>\n                        <mml:mn>1<\/mml:mn>\n                      <\/mml:msub>\n                    <\/mml:msubsup>\n                    <mml:msubsup>\n                      <mml:mi>x<\/mml:mi>\n                      <mml:mn>2<\/mml:mn>\n                      <mml:msub>\n                        <mml:mi>d<\/mml:mi>\n                        <mml:mn>2<\/mml:mn>\n                      <\/mml:msub>\n                    <\/mml:msubsup>\n                    <mml:mo>\u2026<\/mml:mo>\n                    <mml:msubsup>\n                      <mml:mi>x<\/mml:mi>\n                      <mml:mi>k<\/mml:mi>\n                      <mml:msub>\n                        <mml:mi>d<\/mml:mi>\n                        <mml:mi>k<\/mml:mi>\n                      <\/mml:msub>\n                    <\/mml:msubsup>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> with <jats:inline-formula><jats:alternatives><jats:tex-math>$$x_i\\in \\Sigma $$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:msub>\n                      <mml:mi>x<\/mml:mi>\n                      <mml:mi>i<\/mml:mi>\n                    <\/mml:msub>\n                    <mml:mo>\u2208<\/mml:mo>\n                    <mml:mi>\u03a3<\/mml:mi>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula>, <jats:inline-formula><jats:alternatives><jats:tex-math>$$x_j\\ne x_{j+1}$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:msub>\n                      <mml:mi>x<\/mml:mi>\n                      <mml:mi>j<\/mml:mi>\n                    <\/mml:msub>\n                    <mml:mo>\u2260<\/mml:mo>\n                    <mml:msub>\n                      <mml:mi>x<\/mml:mi>\n                      <mml:mrow>\n                        <mml:mi>j<\/mml:mi>\n                        <mml:mo>+<\/mml:mo>\n                        <mml:mn>1<\/mml:mn>\n                      <\/mml:mrow>\n                    <\/mml:msub>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> and <jats:inline-formula><jats:alternatives><jats:tex-math>$$d_i\\ge 2$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:msub>\n                      <mml:mi>d<\/mml:mi>\n                      <mml:mi>i<\/mml:mi>\n                    <\/mml:msub>\n                    <mml:mo>\u2265<\/mml:mo>\n                    <mml:mn>2<\/mml:mn>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> for all <jats:italic>i<\/jats:italic> in [<jats:italic>k<\/jats:italic>] and <jats:italic>j<\/jats:italic> in <jats:inline-formula><jats:alternatives><jats:tex-math>$$[k-1]$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:mo>[<\/mml:mo>\n                    <mml:mi>k<\/mml:mi>\n                    <mml:mo>-<\/mml:mo>\n                    <mml:mn>1<\/mml:mn>\n                    <mml:mo>]<\/mml:mo>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula>. A linear time algorithm for computing a longest letter-duplicated subsequence (LLDS) of <jats:italic>S<\/jats:italic> can be easily obtained. In this paper, we focus on two variants of this problem: (1) \u2018all-appearance\u2019 version, i.e., all letters in <jats:inline-formula><jats:alternatives><jats:tex-math>$$\\Sigma $$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mi>\u03a3<\/mml:mi>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> must appear in the solution, and (2) the weighted version. For the former, we obtain dichotomous results: We prove that, when each letter appears in <jats:italic>S<\/jats:italic> at least 4 times, the problem and a relaxed version on feasibility testing (FT) are both NP-hard. The reduction is from <jats:inline-formula><jats:alternatives><jats:tex-math>$$(3^+,1,2^-)$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:mo>(<\/mml:mo>\n                    <mml:msup>\n                      <mml:mn>3<\/mml:mn>\n                      <mml:mo>+<\/mml:mo>\n                    <\/mml:msup>\n                    <mml:mo>,<\/mml:mo>\n                    <mml:mn>1<\/mml:mn>\n                    <mml:mo>,<\/mml:mo>\n                    <mml:msup>\n                      <mml:mn>2<\/mml:mn>\n                      <mml:mo>-<\/mml:mo>\n                    <\/mml:msup>\n                    <mml:mo>)<\/mml:mo>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula>-SAT, where all 3-clauses (i.e., containing 3\u00a0lals) are monotone (i.e., containing only positive literals) and all 2-clauses contain only negative literals. We then show that when each letter appears in <jats:italic>S<\/jats:italic> at most 3 times, then the problem admits an <jats:italic>O<\/jats:italic>(<jats:italic>n<\/jats:italic>) time algorithm. Finally, we consider the weighted version, where the weight of a block <jats:inline-formula><jats:alternatives><jats:tex-math>$$x_i^{d_i} (d_i\\ge 2)$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:msubsup>\n                      <mml:mi>x<\/mml:mi>\n                      <mml:mi>i<\/mml:mi>\n                      <mml:msub>\n                        <mml:mi>d<\/mml:mi>\n                        <mml:mi>i<\/mml:mi>\n                      <\/mml:msub>\n                    <\/mml:msubsup>\n                    <mml:mrow>\n                      <mml:mo>(<\/mml:mo>\n                      <mml:msub>\n                        <mml:mi>d<\/mml:mi>\n                        <mml:mi>i<\/mml:mi>\n                      <\/mml:msub>\n                      <mml:mo>\u2265<\/mml:mo>\n                      <mml:mn>2<\/mml:mn>\n                      <mml:mo>)<\/mml:mo>\n                    <\/mml:mrow>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> could be any positive function which might not grow with <jats:inline-formula><jats:alternatives><jats:tex-math>$$d_i$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:msub>\n                    <mml:mi>d<\/mml:mi>\n                    <mml:mi>i<\/mml:mi>\n                  <\/mml:msub>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula>. We give a non-trivial <jats:inline-formula><jats:alternatives><jats:tex-math>$$O(n^2)$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:mi>O<\/mml:mi>\n                    <mml:mo>(<\/mml:mo>\n                    <mml:msup>\n                      <mml:mi>n<\/mml:mi>\n                      <mml:mn>2<\/mml:mn>\n                    <\/mml:msup>\n                    <mml:mo>)<\/mml:mo>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> time dynamic programming algorithm for this version, i.e., computing an LD-subsequence of <jats:italic>S<\/jats:italic> whose weight is maximized.<\/jats:p>","DOI":"10.1007\/s00236-024-00459-7","type":"journal-article","created":{"date-parts":[[2024,7,20]],"date-time":"2024-07-20T04:01:58Z","timestamp":1721448118000},"page":"315-329","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["The longest letter-duplicated subsequence and related problems"],"prefix":"10.1007","volume":"61","author":[{"given":"Wenfeng","family":"Lai","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Adiesha","family":"Liyanage","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Binhai","family":"Zhu","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Peng","family":"Zou","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2024,7,20]]},"reference":[{"key":"459_CR1","unstructured":"Asahiro, Y., Eto, H., Gong, M., Jansson, J., Lin, G., Miyano, E., Ono, H., Tanaka, S.: Approximation algorithms for the longest run subsequence problem. In: Bulteau, Lipt\u00e1 Z. (eds) 34th annual symposium on combinatorial pattern matching, CPM 2023, June 26\u201328, 2023, Marne-la-Vall\u00e9e, France, volume 259 of LIPIcs, Schloss Dagstuhl - Leibniz-Zentrum f\u00fcr Informatik, pp. 2(1\u20132), 12. (2023)"},{"issue":"3","key":"459_CR2","doi-asserted-by":"publisher","first-page":"119","DOI":"10.1016\/0020-0190(92)90050-6","volume":"44","author":"DP Bovet","year":"1992","unstructured":"Bovet, D.P., Varricchio, S.: On the regularity of languages on a binary alphabet generated by copying systems. Inf. Process. Lett. 44(3), 119\u2013123 (1992)","journal-title":"Inf. Process. Lett."},{"key":"459_CR3","doi-asserted-by":"crossref","unstructured":"Cicalese, F., Pilati, N.: The tandem duplication distance problem is hard over bounded alphabets. In: Paola, F., Lucia M. (eds.) Combinatorial algorithms - 21st international workshop, IWOCA 2021, Ottawa, Canada, July 5\u20137, 2021, volume 12757 of Lecture notes in computer science, pp. 179\u2013193. Springer (2021)","DOI":"10.1007\/978-3-030-79987-8_13"},{"key":"459_CR4","doi-asserted-by":"publisher","first-page":"1127","DOI":"10.1038\/ng.2762","volume":"45","author":"G Ciriello","year":"2013","unstructured":"Ciriello, G., Miller, M.L., Aksoy, B.A., Senbabaoglu, Y., Schultz, N., Sander, C.: Emerging landscape of oncogenic signatures across human cancers. Nat. Genet. 45, 1127\u20131133 (2013)","journal-title":"Nat. Genet."},{"key":"459_CR5","first-page":"133","volume":"69","author":"J Dassow","year":"1999","unstructured":"Dassow, J., Mitrana, V., Paun, G.: On the regularity of the duplication closure. Bull. EATCS 69, 133\u2013136 (1999)","journal-title":"Bull. EATCS"},{"key":"459_CR6","unstructured":"Dondi, R., Sikora, F.: The longest run subsequence problem: further complexity results. In: Gawrychowski, P., Starikovskaya, T. (eds.) 32nd annual symposium on combinatorial pattern matching, CPM 2021, July 5\u20137, 2021, Wroc\u0142aw, Poland, volume 191 of LIPIcs, pp. 14(1\u201314), 15. Schloss Dagstuhl - Leibniz-Zentrum f\u00fcr Informatik (2021)"},{"issue":"3","key":"459_CR7","doi-asserted-by":"publisher","first-page":"313","DOI":"10.1016\/0166-218X(84)90129-X","volume":"8","author":"A Ehrenfeucht","year":"1984","unstructured":"Ehrenfeucht, A., Rozenberg, G.: On regularity of languages generated by copying systems. Discret. Appl. Math. 8(3), 313\u2013317 (1984)","journal-title":"Discret. Appl. Math."},{"key":"459_CR8","doi-asserted-by":"crossref","unstructured":"Kosowski, A.: An efficient algorithm for the longest tandem scattered subsequence problem. In: Apostolico, A., Melucci, M. (eds.) String processing and information retrieval, 11th international conference, SPIRE 2004, Padova, Italy, October 5\u20138, 2004, proceedings, volume 3246 of Lecture notes in computer science, pp. 93\u2013100. Springer (2004)","DOI":"10.1007\/978-3-540-30213-1_13"},{"key":"459_CR9","unstructured":"Lafond, M., Zhu, B., Zou, P.: The tandem duplication distance is NP-hard. In: Paul, C., Bl\u00e4ser, M. (eds.) 37th international symposium on theoretical aspects of computer science, STACS 2020, March 10\u201313, 2020, Montpellier, France, volume 154 of LIPIcs, pp. 15(1\u201315), 15. Schloss Dagstuhl - Leibniz-Zentrum f\u00fcr Informatik (2020)"},{"key":"459_CR10","doi-asserted-by":"crossref","unstructured":"Lafond, M., Lai, W., Liyanage, A., Zhu, B.: The longest subsequence-repeated subsequence problem. In: Wu, W., Guo, J. (eds.) Combinatorial optimization and applications, 17th international conference, COCOA 2023, Hawaii, HI, USA, December 15\u201317, 2023, proceedings, Part I, volume 14461 of Lecture Notes in Computer Science, pp. 446\u2013458. Springer (2023)","DOI":"10.1007\/978-3-031-49611-0_32"},{"issue":"1","key":"459_CR11","doi-asserted-by":"publisher","first-page":"64","DOI":"10.1137\/20M1356257","volume":"36","author":"M Lafond","year":"2022","unstructured":"Lafond, M., Zhu, B., Zou, P.: Computing the tandem duplication distance is NP-hard. SIAM J. Discret. Math. 36(1), 64\u201391 (2022)","journal-title":"SIAM J. Discret. Math."},{"key":"459_CR12","unstructured":"Lai, W., Liyanage, A., Zhu, B., Zou, P.: Beyond the longest letter-duplicated subsequence problem. In: Bannai, H., Holub, J. (eds.) 33rd annual symposium on combinatorial pattern matching, CPM 2022, June 27\u201329, 2022, Prague, Czech Republic volume 223 of LIPIcs, pp. 7(1\u20137), 12. Schloss Dagstuhl - Leibniz-Zentrum f\u00fcr Informatik (2022)"},{"issue":"6822","key":"459_CR13","doi-asserted-by":"publisher","first-page":"860","DOI":"10.1038\/35057062","volume":"409","author":"ES Lander","year":"2001","unstructured":"Lander, E.S., et al.: Initial sequencing and analysis of the human genome. Nature 409(6822), 860\u2013921 (2001)","journal-title":"Nature"},{"issue":"338","key":"459_CR14","doi-asserted-by":"publisher","first-page":"277","DOI":"10.2307\/3610126","volume":"41","author":"J Leech","year":"1957","unstructured":"Leech, J.: A problem on strings of beads. Math. Gaz. 41(338), 277\u2013278 (1957)","journal-title":"Math. Gaz."},{"issue":"6","key":"459_CR15","doi-asserted-by":"publisher","first-page":"971","DOI":"10.1016\/0092-8674(93)90585-E","volume":"72","author":"ME Macdonald","year":"1993","unstructured":"Macdonald, M.E., et al.: A novel gene containing a trinucleotide repeat that is expanded and unstable on Huntington\u2019s disease. Cell 72(6), 971\u2013983 (1993)","journal-title":"Cell"},{"key":"459_CR16","doi-asserted-by":"crossref","unstructured":"The Cancer Genome Atlas Research Network: Integrated genomic analyses of ovarian carcinoma. Nature 474, 609\u2013615 (2011)","DOI":"10.1038\/nature10166"},{"issue":"Suppl 6","key":"459_CR17","doi-asserted-by":"publisher","first-page":"S10","DOI":"10.1186\/1471-2105-13-S6-S10","volume":"13","author":"L Oesper","year":"2012","unstructured":"Oesper, L., Ritz, A.M., Aerni, S.J., Drebin, R., Raphael, B.J.: Reconstructing cancer genomes from paired-end sequencing data. BMC Bioinf. 13(Suppl 6), S10 (2012)","journal-title":"BMC Bioinf."},{"key":"459_CR18","unstructured":"Schrinner, S., Goel, M., Wulfert, M., Spohr, P., Schneeberger, K., Klau, G.W.: The longest run subsequence problem. In: Kingsford, C., Pisanti, N. (eds.) 20th international workshop on algorithms in bioinformatics, WABI 2020, September 7\u20139, 2020, Pisa, Italy (virtual conference), volume 172 of LIPIcs, pp. 6(1\u20136), 13. Schloss Dagstuhl - Leibniz-Zentrum f\u00fcr Informatik (2020)"},{"issue":"1","key":"459_CR19","doi-asserted-by":"publisher","first-page":"11","DOI":"10.1186\/s13015-021-00191-8","volume":"16","author":"S Schrinner","year":"2021","unstructured":"Schrinner, S., Goel, M., Wulfert, M., Spohr, P., Schneeberger, K., Klau, G.W.: Using the longest run subsequence problem within homology-based scaffolding. Algorithms Mol. Biol. 16(1), 11 (2021)","journal-title":"Algorithms Mol. Biol."},{"issue":"1","key":"459_CR20","doi-asserted-by":"publisher","first-page":"78","DOI":"10.1086\/431652","volume":"77","author":"AJ Sharp","year":"2005","unstructured":"Sharp, A.J., Eichler, E.E., et al.: Segmental duplications and copy-number variation in the human genome. Am. J. Hum. Genet. 77(1), 78\u201388 (2005)","journal-title":"Am. J. Hum. Genet."},{"key":"459_CR21","doi-asserted-by":"publisher","first-page":"426","DOI":"10.1038\/284426a0","volume":"284","author":"JW Szostak","year":"1980","unstructured":"Szostak, J.W., Ray, W.: Unequal crossing over in the ribosomal DNA of saccharomyces cerevisiae. Nature 284, 426\u2013430 (1980)","journal-title":"Nature"},{"key":"459_CR22","first-page":"162","volume":"70","author":"M-W Wang","year":"2000","unstructured":"Wang, M.-W.: On the irregularity of the duplication closure. Bull. EATCS 70, 162\u2013163 (2000)","journal-title":"Bull. EATCS"},{"issue":"03","key":"459_CR23","doi-asserted-by":"publisher","first-page":"499","DOI":"10.1142\/S0219720009004199","volume":"7","author":"C Zheng","year":"2009","unstructured":"Zheng, C., Wall, P.K., Leebens-Mack, J., Pamphilis, C.D.E., Albert, V.A., Sankoff, D.: Gene loss under neighborhood selection following whole genome duplication and the reconstruction of the ancestral populus genome. J. Bioinform. Comput. Biol. 7(03), 499\u2013520 (2009)","journal-title":"J. Bioinform. Comput. Biol."}],"container-title":["Acta Informatica"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00236-024-00459-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s00236-024-00459-7\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00236-024-00459-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,14]],"date-time":"2024-08-14T10:03:07Z","timestamp":1723629787000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s00236-024-00459-7"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7,20]]},"references-count":23,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2024,9]]}},"alternative-id":["459"],"URL":"https:\/\/doi.org\/10.1007\/s00236-024-00459-7","relation":{},"ISSN":["0001-5903","1432-0525"],"issn-type":[{"value":"0001-5903","type":"print"},{"value":"1432-0525","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,7,20]]},"assertion":[{"value":"9 December 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"11 July 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 July 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}