{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T01:50:41Z","timestamp":1760233841304,"version":"build-2065373602"},"reference-count":39,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2021,2,24]],"date-time":"2021-02-24T00:00:00Z","timestamp":1614124800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>We consider the problem of determinizing and minimizing automata for nested words in practice. For this we compile the nested regular expressions (NREs) from the usual XPath benchmark to nested word automata (NWAs). The determinization of these NWAs, however, fails to produce reasonably small automata. In the best case, huge deterministic NWAs are produced after few hours, even for relatively small NREs of the benchmark. We propose a different approach to the determinization of automata for nested words. For this, we introduce stepwise hedge automata (SHAs) that generalize naturally on both (stepwise) tree automata and on finite word automata. We then show how to determinize SHAs, yielding reasonably small deterministic automata for the NREs from the XPath benchmark. The size of deterministic SHAs automata can be reduced further by a novel minimization algorithm for a subclass of SHAs. In order to understand why the new approach to determinization and minimization works so nicely, we investigate the relationship between NWAs and SHAs further. Clearly, deterministic SHAs can be compiled to deterministic NWAs in linear time, and conversely NWAs can be compiled to nondeterministic SHAs in polynomial time. Therefore, we can use SHAs as intermediates for determinizing NWAs, while avoiding the huge size increase with the usual determinization algorithm for NWAs. Notably, the NWAs obtained from the SHAs perform bottom-up and left-to-right computations only, but no top-down computations. This NWA behavior can be distinguished syntactically by the (weak) single-entry property, suggesting a close relationship between SHAs and single-entry NWAs. In particular, it turns out that the usual determinization algorithm for NWAs behaves well for single-entry NWAs, while it quickly explodes without the single-entry property. Furthermore, it is known that the class of deterministic multi-module single-entry NWAs enjoys unique minimization. The subclass of deterministic SHAs to which our novel minimization algorithm applies is different though, in that we do not impose multiple modules. As further optimizations for reducing the sizes of the constructed SHAs, we propose schema-based cleaning and symbolic representations based on apply-else rules that can be maintained by determinization. We implemented the optimizations and report the experimental results for the automata constructed for the XPathMark benchmark.<\/jats:p>","DOI":"10.3390\/a14030068","type":"journal-article","created":{"date-parts":[[2021,2,24]],"date-time":"2021-02-24T20:53:21Z","timestamp":1614200001000},"page":"68","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Determinization and Minimization of Automata for Nested Words Revisited"],"prefix":"10.3390","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2611-8950","authenticated-orcid":false,"given":"Joachim","family":"Niehren","sequence":"first","affiliation":[{"name":"Inria, Universit\u00e9 de Lille, 59000 Lille, France"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6802-5480","authenticated-orcid":false,"given":"Momar","family":"Sakho","sequence":"additional","affiliation":[{"name":"Inria, Universit\u00e9 de Lille, 59000 Lille, France"}]}],"member":"1968","published-online":{"date-parts":[[2021,2,24]]},"reference":[{"key":"ref_1","first-page":"422","article-title":"Pebbling Moutain Ranges and its Application of DCFL-Recognition","volume":"Volume 85","year":"1980","journal-title":"Automata, Languages and Programming, Proceedings of the 7th Colloquium, Noordweijkerhout, The Netherlands, 14\u201318 July 1980"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/S0304-0208(08)73072-X","article-title":"Input Driven Languages are Recognized in log n Space","volume":"102","author":"Verbeek","year":"1985","journal-title":"North Holl. Math. Stud."},{"key":"ref_3","unstructured":"Alur, R. (2004, January 14\u201316). Marrying Words and Trees. Proceedings of the 26th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, Paris, France."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"47","DOI":"10.1145\/2636805.2636821","article-title":"Complexity of input-driven pushdown automata","volume":"45","author":"Okhotin","year":"2014","journal-title":"SIGACT News"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Alur, R., and Madhusudan, P. (2004, January 13\u201316). Visibly pushdown languages. Proceedings of the 36th ACM Symposium on Theory of Computing, Chicago, IL, USA.","DOI":"10.1145\/1007352.1007390"},{"key":"ref_6","first-page":"134","article-title":"Locating Matches of Tree Patterns in Forests","volume":"Volume 1530","author":"Neumann","year":"1998","journal-title":"Proceedings of the Foundations of Software Technology and Theoretical Computer Science, Chennai, India, 17\u201319 December 1998"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"13","DOI":"10.1016\/j.ipl.2008.08.002","article-title":"Streaming Tree Automata","volume":"109","author":"Gauwin","year":"2008","journal-title":"Inf. Process. Lett."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"317","DOI":"10.1016\/S0022-0000(67)80022-9","article-title":"Characterizing derivation trees of context-free grammars through a generalization of automata theory","volume":"1","author":"Thatcher","year":"1967","journal-title":"J. Comput. Syst. Sci."},{"key":"ref_9","unstructured":"Comon, H., Dauchet, M., Gilleron, R., L\u00f6ding, C., Jacquemard, F., Lugiez, D., Tison, S., and Tommasi, M. (2021, February 01). Tree Automata Techniques and Applications. Available online: http:\/\/tata.gforge.inria.fr."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1145\/767193.767195","article-title":"XDuce: A statically typed XML processing language","volume":"3","author":"Hosoya","year":"2003","journal-title":"ACM Trans. Internet Technol."},{"key":"ref_11","first-page":"150","article-title":"From Regular Expressions to Nested Words: Unifying Languages and Query Execution for Relational and XML Sequences","volume":"3","author":"Mozafari","year":"2010","journal-title":"PVLDB"},{"key":"ref_12","first-page":"1","article-title":"Visibly Pushdown Expression Effects for XML Stream Processing","volume":"1060","author":"Pitcher","year":"2005","journal-title":"Program. Lang. Technol. XML"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"934","DOI":"10.1109\/TKDE.2007.1063","article-title":"SPEX: Streamed and Progressive Evaluation of XPath","volume":"19","author":"Olteanu","year":"2007","journal-title":"IEEE Trans. Know. Data Eng."},{"key":"ref_14","first-page":"3","article-title":"Streamable Fragments of Forward XPath","volume":"Volume 6807","author":"Markhoff","year":"2011","journal-title":"Proceedings of the International Conference on Implementation and Application of Automata, Blois, France, 13\u201316 July 2011"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Benedikt, M., Jeffrey, A., and Ley-Wild, R. (2008, January 10\u201312). Stream Firewalling of XML Constraints. Proceedings of the ACM SIGMOD International Conference on Management of Data, Vancouver, BC, Canada.","DOI":"10.1145\/1376616.1376667"},{"key":"ref_16","first-page":"121","article-title":"Earliest Query Answering for Deterministic Nested Word Automata","volume":"Volume 5699","author":"Gauwin","year":"2009","journal-title":"Proceedings of the 17th International Symposium on Fundamentals of Computer Theory, Wroclaw, Poland, 2\u20134 September 2009"},{"key":"ref_17","unstructured":"Franceschet, M. (2020, October 25). XPathMark Performance Test. Available online: https:\/\/users.dimi.uniud.it\/~massimo.franceschet\/xpathmark\/PTbench.html."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"100","DOI":"10.1016\/j.tcs.2015.01.017","article-title":"Early nested word automata for XPath query answering on XML streams","volume":"578","author":"Debarbieux","year":"2015","journal-title":"Theor. Comput. Sci."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"197","DOI":"10.1016\/0304-3975(93)90287-4","article-title":"Regular Expressions into Finite Automata","volume":"120","year":"1993","journal-title":"Theor. Comput. Sci."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"182","DOI":"10.1006\/inco.1997.2695","article-title":"One-Unambiguous Regular Languages","volume":"142","author":"Wood","year":"1998","journal-title":"Inf. Comput."},{"key":"ref_21","first-page":"105","article-title":"Querying Unranked Trees with Stepwise Tree Automata","volume":"Volume 3091","author":"Carme","year":"2004","journal-title":"Proceedings of the 19th International Conference on Rewriting Techniques and Applications, Paris, France, 26\u201328 June 2004"},{"key":"ref_22","first-page":"1102","article-title":"Congruences for Visibly Pushdown Languages","volume":"Volume 3580","author":"Alur","year":"2005","journal-title":"Automata, Languages and Programming, Proceedings of the 32nd International Colloquium, Lisbon, Portugal, 11\u201315 July 2005"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Ku\u010dera, L., and Ku\u010dera, A. (2007). Minimizing Variants of Visibly Pushdown Automata. Mathematical Foundations of Computer Science 2007, Springer.","DOI":"10.1007\/978-3-540-74456-6"},{"key":"ref_24","unstructured":"Gauwin, O., Muscholl, A., and Raskin, M. (2020). Minimization of visibly pushdown automata is NP-complete. Log. Methods Comput. Sci., 16."},{"key":"ref_25","first-page":"169","article-title":"Nested Regular Expressions Can Be Compiled to Small Deterministic Nested Word Automata","volume":"Volume 12159","author":"Fernau","year":"2020","journal-title":"Computer Science\u2014Theory and Applications, Proceedings of the 15th International Computer Science Symposium in Russia (CSR 2020), Yekaterinburg, Russia, 29 June\u20133 July 2020"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Gottlob, G., Koch, C., and Pichler, R. (2003, January 9\u201312). The complexity of XPath query evaluation. Proceedings of the 22nd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, San Diego, CA, USA.","DOI":"10.1145\/773153.773171"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Libkin, L., Martens, W., and Vrgo\u010d, D. (2013, January 18\u201322). Querying Graph Databases with XPath. Proceedings of the 16th International Conference on Database Theory (ICDT\u201913), Genoa, Italy.","DOI":"10.1145\/2448496.2448513"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"194","DOI":"10.1016\/0022-0000(79)90046-1","article-title":"Propositional Dynamic Logic of Regular Programs","volume":"18","author":"Fischer","year":"1979","journal-title":"J. Comput. Syst. Sci."},{"key":"ref_29","unstructured":"Candan, K.S., Chen, Y., Snodgrass, R.T., Gravano, L., and Fuxman, A. (2012). High-performance complex event processing over XML streams. Proceedings of the SIGMOD Conference, Scottsdale, AZ, USA, 20\u201324 May 2012, ACM."},{"key":"ref_30","unstructured":"Grez, A., Riveros, C., and Ugarte, M. (2019, January 26\u201328). A Formal Framework for Complex Event Processing. Proceedings of the 22nd International Conference on Database Theory (ICDT 2019), Lisbon, Portugal."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1007\/s00236-013-0190-6","article-title":"Visibly Rational Expressions","volume":"51","author":"Bozzelli","year":"2014","journal-title":"Acta Inf."},{"key":"ref_32","unstructured":"G\u00e9cseg, F., and Steinby, M. (1984). Tree Automata, Akad\u00e9miai Kiad\u00f3."},{"key":"ref_33","unstructured":"Scott, D., and de Bakker, J.W. (1969). A Theory of Programs, IBM. Unpublished Manuscript."},{"key":"ref_34","unstructured":"Stockmeyer, L.J., and Meyer, A.R. (May, January 30). Word Problems Requiring Exponential Time. Proceedings of the 5th ACM Symposium on Theory of Computing, Austin, TX, USA."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"1181","DOI":"10.1016\/j.ic.2009.03.003","article-title":"Efficient Inclusion Checking for Deterministic Tree Automata and XML Schemas","volume":"207","author":"Gilleron","year":"2009","journal-title":"Inf. Comput."},{"key":"ref_36","unstructured":"Aho, A.V., Lam, M.S., Sethi, R., and Ullman, J.D. (2006). Compilers: Principles, Techniques, and Tools, Addison Wesley. [2nd ed.]."},{"key":"ref_37","unstructured":"Arnold, A., and Niwi\u0144ski, D. (2001). Complete lattices and fixed-point theorems. Rudiments of \u03bc-Calculus, Elsevier. Studies in Logic and the Foundations of Mathematics."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"550","DOI":"10.1016\/j.jcss.2006.10.021","article-title":"On the Minimization of XML-Schemas and Tree Automata for Unranked Trees","volume":"73","author":"Martens","year":"2007","journal-title":"J. Comput. Syst. Sci."},{"key":"ref_39","first-page":"209","article-title":"Symbolic Visibly Pushdown Automata","volume":"Volume 8559","author":"Biere","year":"2014","journal-title":"Computer Aided Verification, Proceedings of the 26th International Conference (CAV, VSL 2014), Vienna, Austria, 18\u201322 July 2014"}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/14\/3\/68\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T05:27:34Z","timestamp":1760160454000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/14\/3\/68"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,2,24]]},"references-count":39,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2021,3]]}},"alternative-id":["a14030068"],"URL":"https:\/\/doi.org\/10.3390\/a14030068","relation":{},"ISSN":["1999-4893"],"issn-type":[{"type":"electronic","value":"1999-4893"}],"subject":[],"published":{"date-parts":[[2021,2,24]]}}}