{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,7]],"date-time":"2026-02-07T04:27:17Z","timestamp":1770438437157,"version":"3.49.0"},"reference-count":59,"publisher":"Association for Computing Machinery (ACM)","issue":"2","funder":[{"name":"JSPS KAKENHI","award":["JP25K21150, JP23H04378, JP24K02899, and JP20H04141"],"award-info":[{"award-number":["JP25K21150, JP23H04378, JP24K02899, and JP20H04141"]}]},{"name":"ROIS NII Open Collaborative Research 2026","award":["252M-23667"],"award-info":[{"award-number":["252M-23667"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Algorithms"],"published-print":{"date-parts":[[2026,4,30]]},"abstract":"<jats:p>Repetitiveness measures reveal profound characteristics of datasets, and give rise to compressed data structures and algorithms working in compressed space. Alas, the computation of some of these measures is NP-hard, and straight-forward computation is infeasible for datasets of even small sizes. Three such measures are the smallest size of a string attractor, the smallest size of a bidirectional macro scheme, and the smallest size of a straight-line program. While a vast variety of implementations for heuristically computing approximations exist, exact computation of these measures has received little to no attention. In this article, we present MAX-SAT formulations that provide the first non-trivial implementations for exact computation of smallest string attractors, smallest bidirectional macro schemes, and smallest straight-line programs. Computational experiments show that our implementations work for texts of length up to a few hundred for straight-line programs and bidirectional macro schemes, and texts even over a million for string attractors.<\/jats:p>","DOI":"10.1145\/3777895","type":"journal-article","created":{"date-parts":[[2025,11,24]],"date-time":"2025-11-24T15:36:55Z","timestamp":1763998615000},"page":"1-25","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Computing NP-hard Repetitiveness Measures via MAX-SAT"],"prefix":"10.1145","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6856-5185","authenticated-orcid":false,"given":"Hideo","family":"Bannai","sequence":"first","affiliation":[{"name":"M&amp;D Data Science Center, Institute of Integrated Research, Institute of Science Tokyo, Tokyo, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6964-6182","authenticated-orcid":false,"given":"Keisuke","family":"Goto","sequence":"additional","affiliation":[{"name":"Independent Researcher, Tokyo, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0971-060X","authenticated-orcid":false,"given":"Masakazu","family":"Ishihata","sequence":"additional","affiliation":[{"name":"NTT Communication Science Laboratories, Kyoto, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5462-122X","authenticated-orcid":false,"given":"Shunsuke","family":"Kanda","sequence":"additional","affiliation":[{"name":"Independent Researcher, Tokyo, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8721-4444","authenticated-orcid":false,"given":"Dominik","family":"K\u00f6ppl","sequence":"additional","affiliation":[{"name":"University of Yamanashi, Kofu, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-3798-8397","authenticated-orcid":false,"given":"Takaaki","family":"Nishimoto","sequence":"additional","affiliation":[{"name":"RIKEN Center for Advanced Intelligence Project, Tokyo, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2295-1299","authenticated-orcid":false,"given":"Bernardo","family":"Subercaseaux","sequence":"additional","affiliation":[{"name":"Carnegie Mellon University, Pittsburgh, Pennsylvania, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2026,2,6]]},"reference":[{"key":"e_1_3_2_2_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ic.2022.104999"},{"key":"e_1_3_2_3_1","first-page":"24","volume-title":"Proceedings of the SAT Competition","volume":"2018","author":"Audemard Gilles","year":"2018","unstructured":"Gilles Audemard and Laurent Simon. 2018. Glucose and syrup: Nine years in the SAT competitions. In Proceedings of the SAT Competition. Department of Computer Science Series of Publications B, Vol. B-2018-1, 24\u201325."},{"key":"e_1_3_2_4_1","first-page":"399","volume-title":"Proceedings of the IJCAI","author":"Audemard Gilles","year":"2009","unstructured":"Gilles Audemard and Laurent Simon. 2009. Predicting learnt clauses quality in modern SAT solvers. In Proceedings of the IJCAI, 399\u2013404."},{"key":"e_1_3_2_5_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-86692-1_14"},{"key":"e_1_3_2_6_1","doi-asserted-by":"publisher","DOI":"10.4230\/LIPIcs.ESA.2022.12"},{"key":"e_1_3_2_7_1","doi-asserted-by":"publisher","unstructured":"Timothy C. Bell Ian H. Witten and John G. Cleary. 1989. Modeling for text compression. ACM Computing Surveys 21 4 (1989) 557\u2013591. DOI: 10.1145\/76894.76896","DOI":"10.1145\/76894.76896"},{"key":"e_1_3_2_8_1","first-page":"10","volume-title":"Proceedings of the SAT Competition 2021\u2014Solver and Benchmark Descriptions","volume":"2021","author":"Biere Armin","year":"2021","unstructured":"Armin Biere, Katalin Fazekas, Mathias Fleury, and Maximillian Heisinger. 2021. CaDiCaL, Kissat, Paracooba, Plingeling and Treengeling entering the SAT competition 2021. In Proceedings of the SAT Competition 2021\u2014Solver and Benchmark Descriptions. Department of Computer Science Report Series B, Vol. B-2021-1, 10\u201313."},{"key":"e_1_3_2_9_1","doi-asserted-by":"publisher","DOI":"10.5555\/1550723"},{"key":"e_1_3_2_10_1","doi-asserted-by":"publisher","unstructured":"Philip Bille Travis Gagie Inge Li G\u00f8rtz and Nicola Prezza. 2018. A separation between RLSLPs and LZ77. Journal of Discrete Algorithms 50 (2018) 36\u201339. DOI: 10.1016\/j.jda.2018.09.002","DOI":"10.1016\/j.jda.2018.09.002"},{"key":"e_1_3_2_11_1","doi-asserted-by":"publisher","unstructured":"Anselm Blumer J. Blumer David Haussler Ross M. McConnell and Andrzej Ehrenfeucht. 1987. Complete inverted files for efficient text retrieval and analysis. Journal of the ACM 34 3 (1987) 578\u2013595. DOI: 10.1145\/28869.28873","DOI":"10.1145\/28869.28873"},{"key":"e_1_3_2_12_1","article-title":"Maximum, over all binary strings  \\(w\\)  of length  \\(n\\) , of the size of the smallest string attractor for  \\(w\\)","author":"Branicky Michael S.","year":"2022","unstructured":"Michael S. Branicky. 2022. Maximum, over all binary strings \\(w\\) of length \\(n\\) , of the size of the smallest string attractor for \\(w\\) . In Entry A339391 in The On-Line Encyclopedia of Integer Sequences (OEIS). Retrieved from April 13, 2022 from https:\/\/oeis.org\/A339391","journal-title":"Entry A339391 in The On-Line Encyclopedia of Integer Sequences (OEIS)"},{"key":"e_1_3_2_13_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00224-020-10013-w"},{"key":"e_1_3_2_14_1","doi-asserted-by":"publisher","unstructured":"Anders Roy Christiansen Mikko Berggren Ettienne Tomasz Kociumaka Gonzalo Navarro and Nicola Prezza. 2021. Optimal-time dictionary-compressed indexes. ACM Transactions on Algorithms 17 1 Article 8 (2021) 1\u201339. DOI: 10.1145\/3426473","DOI":"10.1145\/3426473"},{"key":"e_1_3_2_15_1","doi-asserted-by":"publisher","unstructured":"Maxime Crochemore and Lucian Ilie. 2008. Computing longest previous factor in linear time and applications. Information Processing Letters 106 2 (2008) 75\u201380. DOI: 10.1016\/j.ipl.2007.10.006","DOI":"10.1016\/j.ipl.2007.10.006"},{"key":"e_1_3_2_16_1","doi-asserted-by":"publisher","unstructured":"Maxime Crochemore and German Tischler. 2011. Computing longest previous non-overlapping factors. Information Processing Letters 111 6 (2011) 291\u2013295. DOI: 10.1016\/j.ipl.2010.12.005","DOI":"10.1016\/j.ipl.2010.12.005"},{"key":"e_1_3_2_17_1","doi-asserted-by":"publisher","DOI":"10.4230\/LIPIcs.ESA.2019.41"},{"key":"e_1_3_2_18_1","doi-asserted-by":"publisher","DOI":"10.4230\/LIPIcs.SEA.2017.13"},{"key":"e_1_3_2_19_1","doi-asserted-by":"publisher","DOI":"10.1007\/11814948_25"},{"key":"e_1_3_2_20_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-19929-0_19"},{"key":"e_1_3_2_21_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-94144-8_26"},{"key":"e_1_3_2_22_1","doi-asserted-by":"publisher","unstructured":"Alexey Ignatiev Ant\u00f3nio Morgado and Jo\u00e3o Marques-Silva. 2019. RC2: An efficient MaxSAT solver. Journal on Satisfiability Boolean Modelling and Computation 11 1 (2019) 53\u201364. DOI: 10.3233\/SAT190116","DOI":"10.3233\/SAT190116"},{"key":"e_1_3_2_23_1","unstructured":"OEIS Foundation Inc. 2022. Maximum over all binary strings \\( w \\) of length \\( n \\) of the size of the smallest string attractor for \\( w \\) entry A339391 in the on-line encyclopedia of integer sequences. Retrieved April 13 2022 from https:\/\/oeis.org\/A339391"},{"key":"e_1_3_2_24_1","unstructured":"Marek Karpinski Wojciech Rytter and Ayumi Shinohara. 1997. An efficient pattern-matching algorithm for strings with short descriptions. Nordic Journal of Computing 4 2 (1997) 172\u2013186."},{"key":"e_1_3_2_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/FOCS46700.2020.00097"},{"key":"e_1_3_2_26_1","doi-asserted-by":"publisher","DOI":"10.4230\/LIPIcs.ESA.2018.52"},{"key":"e_1_3_2_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/3188745.3188814"},{"key":"e_1_3_2_28_1","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611977073.111"},{"key":"e_1_3_2_29_1","doi-asserted-by":"publisher","DOI":"10.5555\/2898950"},{"key":"e_1_3_2_30_1","doi-asserted-by":"publisher","unstructured":"Tomasz Kociumaka Gonzalo Navarro and Nicola Prezza. 2023. Toward a definitive compressibility measure for repetitive sequences. IEEE Transactions on Information Theory 69 4 (2023) 2074\u20132092. DOI: 10.1109\/TIT.2022.3224382","DOI":"10.1109\/TIT.2022.3224382"},{"key":"e_1_3_2_31_1","doi-asserted-by":"publisher","unstructured":"Sebastian Kreft and Gonzalo Navarro. 2013. On compressing and indexing repetitive sequences. Theoretical Computer Science 483 (2013) 115\u2013133. DOI: 10.1016\/j.tcs.2012.02.006","DOI":"10.1016\/j.tcs.2012.02.006"},{"key":"e_1_3_2_32_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-59212-7_15"},{"key":"e_1_3_2_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/DCC.1999.755679"},{"key":"e_1_3_2_34_1","doi-asserted-by":"publisher","unstructured":"Abraham Lempel and Jacob Ziv. 1976. On the complexity of finite sequences. IEEE Transactions on Information Theory 22 1 (1976) 75\u201381. DOI: 10.1109\/TIT.1976.1055501","DOI":"10.1109\/TIT.1976.1055501"},{"key":"e_1_3_2_35_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-11298-1"},{"key":"e_1_3_2_36_1","doi-asserted-by":"crossref","unstructured":"Veli M\u00e4kinen and Gonzalo Navarro. 2005. Succinct suffix arrays based on run-length encoding. Nordic Journal of Computing 12 1 (2005) 40\u201366.","DOI":"10.1007\/11496656_5"},{"key":"e_1_3_2_37_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-02777-2_45"},{"key":"e_1_3_2_38_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.tcs.2020.11.006"},{"key":"e_1_3_2_39_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0020-0190(02)00512-4"},{"key":"e_1_3_2_40_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-10428-7_39"},{"key":"e_1_3_2_41_1","doi-asserted-by":"publisher","DOI":"10.4230\/LIPIcs.CPM.2022.26"},{"key":"e_1_3_2_42_1","doi-asserted-by":"publisher","unstructured":"Ant\u00f3nio Morgado Federico Heras Mark H. Liffiton Jordi Planes and Jo\u00e3o Marques-Silva. 2013. Iterative and core-guided MaxSAT solving: A survey and assessment. Constraints: An International Journal 18 4 (2013) 478\u2013534. DOI: 10.1007\/S10601-013-9146-2","DOI":"10.1007\/S10601-013-9146-2"},{"key":"e_1_3_2_43_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00453-016-0178-z"},{"key":"e_1_3_2_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/3434399"},{"key":"e_1_3_2_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/3432999"},{"key":"e_1_3_2_46_1","doi-asserted-by":"publisher","unstructured":"Gonzalo Navarro Carlos Ochoa and Nicola Prezza. 2021. On the approximation ratio of ordered parsings. IEEE Transactions on Information Theory 67 2 (2021) 1008\u20131026. DOI: 10.1109\/TIT.2020.3042746","DOI":"10.1109\/TIT.2020.3042746"},{"key":"e_1_3_2_47_1","doi-asserted-by":"publisher","unstructured":"Gonzalo Navarro and Nicola Prezza. 2019. Universal compressed text indexing. Theoretical Computer Science 762 (2019) 41\u201350. DOI: 10.1016\/j.tcs.2018.09.007","DOI":"10.1016\/j.tcs.2018.09.007"},{"key":"e_1_3_2_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/DCC.1997.581951"},{"key":"e_1_3_2_49_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ic.2021.104859"},{"key":"e_1_3_2_50_1","doi-asserted-by":"publisher","DOI":"10.1109\/DCC47342.2020.00023"},{"key":"e_1_3_2_51_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0304-3975(02)00777-6"},{"key":"e_1_3_2_52_1","doi-asserted-by":"publisher","DOI":"10.1587\/transinf.E92.D.158"},{"key":"e_1_3_2_53_1","unstructured":"Hiroshi Sakamoto Shinichi Shimozono Ayumi Shinohara and Masayuki Takeda. 2001. On the Minimization Problem of Text Compression Scheme by a Reduced Grammar Transform. Technical Report 195. Kyushu University."},{"key":"e_1_3_2_54_1","unstructured":"Luke Schaeffer and Jeffrey Shallit. 2020. String attractors for automatic sequences. arXiv:2012.06840. Retrieved from https:\/\/arxiv.org\/abs\/2012.06840"},{"key":"e_1_3_2_55_1","doi-asserted-by":"publisher","DOI":"10.1007\/11564751_73"},{"key":"e_1_3_2_56_1","doi-asserted-by":"publisher","unstructured":"James A. Storer and Thomas G. Szymanski. 1982. Data compression via textural substitution. Journal of the ACM 29 4 (1982) 928\u2013951. DOI: 10.1145\/322344.322346","DOI":"10.1145\/322344.322346"},{"key":"e_1_3_2_57_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-66158-2_43"},{"key":"e_1_3_2_58_1","doi-asserted-by":"publisher","DOI":"10.4230\/LIPICS.SAT.2023.30"},{"key":"e_1_3_2_59_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1977.1055714"},{"key":"e_1_3_2_60_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1978.1055934"}],"container-title":["ACM Transactions on Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3777895","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,6]],"date-time":"2026-02-06T14:22:56Z","timestamp":1770387776000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3777895"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,2,6]]},"references-count":59,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2026,4,30]]}},"alternative-id":["10.1145\/3777895"],"URL":"https:\/\/doi.org\/10.1145\/3777895","relation":{},"ISSN":["1549-6325","1549-6333"],"issn-type":[{"value":"1549-6325","type":"print"},{"value":"1549-6333","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,2,6]]},"assertion":[{"value":"2024-03-07","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-11-12","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2026-02-06","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}