{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,18]],"date-time":"2025-12-18T14:18:15Z","timestamp":1766067495128,"version":"build-2065373602"},"reference-count":73,"publisher":"Springer Science and Business Media LLC","issue":"5","license":[{"start":{"date-parts":[[2022,3,30]],"date-time":"2022-03-30T00:00:00Z","timestamp":1648598400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.springernature.com\/gp\/researchers\/text-and-data-mining"},{"start":{"date-parts":[[2022,3,30]],"date-time":"2022-03-30T00:00:00Z","timestamp":1648598400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.springernature.com\/gp\/researchers\/text-and-data-mining"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Mach Learn"],"published-print":{"date-parts":[[2024,5]]},"DOI":"10.1007\/s10994-022-06164-1","type":"journal-article","created":{"date-parts":[[2022,3,30]],"date-time":"2022-03-30T21:04:27Z","timestamp":1648674267000},"page":"2619-2653","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Connecting weighted automata, tensor networks and recurrent neural networks through spectral learning"],"prefix":"10.1007","volume":"113","author":[{"given":"Tianyu","family":"Li","sequence":"first","affiliation":[]},{"given":"Doina","family":"Precup","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8090-2810","authenticated-orcid":false,"given":"Guillaume","family":"Rabusseau","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,3,30]]},"reference":[{"issue":"4","key":"6164_CR1","doi-asserted-by":"publisher","first-page":"319","DOI":"10.1007\/BF00116828","volume":"2","author":"D Angluin","year":"1988","unstructured":"Angluin, D. (1988). Queries and concept learning. Machine Learning, 2(4), 319\u2013342.","journal-title":"Machine Learning"},{"key":"6164_CR2","unstructured":"Avcu, E., Shibata, C., & Heinz, J. (2017). Subregular complexity and deep learning. In CLASP papers in computational linguistics (p.\u00a020)."},{"key":"6164_CR3","unstructured":"Ayache, S., Eyraud, R., & Goudian, N. (2018). Explaining black boxes on sequential data using weighted automata (pp. 81\u2013103)."},{"key":"6164_CR4","doi-asserted-by":"crossref","unstructured":"Bailly, R., Denis, F., & Ralaivola, L. (2009). Grammatical inference as a principal component analysis problem (pp. 33\u201340).","DOI":"10.1145\/1553374.1553379"},{"key":"6164_CR5","unstructured":"Balle, B., & Mohri, M. (2012). Spectral learning of general weighted automata via constrained matrix completion (pp. 2159\u20132167)"},{"issue":"1\u20132","key":"6164_CR6","doi-asserted-by":"publisher","first-page":"33","DOI":"10.1007\/s10994-013-5416-x","volume":"96","author":"B Balle","year":"2014","unstructured":"Balle, B., Carreras, X., Luque, F. M., & Quattoni, A. (2014). Spectral learning of weighted automata. Machine Learning, 96(1\u20132), 33\u201363.","journal-title":"Machine Learning"},{"key":"6164_CR7","unstructured":"Biamonte, J., & Bergholm, V. (2017). Tensor networks in a nutshell. arXiv preprint arXiv:170800006."},{"issue":"7","key":"6164_CR8","doi-asserted-by":"publisher","first-page":"954","DOI":"10.1177\/0278364911404092","volume":"30","author":"B Boots","year":"2011","unstructured":"Boots, B., Siddiqi, S. M., & Gordon, G. J. (2011). Closing the learning-planning loop with predictive state representations. International Journal of Robotics Research, 30(7), 954\u2013966.","journal-title":"International Journal of Robotics Research"},{"issue":"1","key":"6164_CR9","doi-asserted-by":"publisher","first-page":"102","DOI":"10.1214\/14-AOS1267","volume":"43","author":"TT Cai","year":"2015","unstructured":"Cai, T. T., & Zhang, A.. (2015). Rop: Matrix recovery via rank-one projections. The Annals of Statistics, 43(1), 102\u2013138.","journal-title":"The Annals of Statistics"},{"issue":"4","key":"6164_CR10","doi-asserted-by":"publisher","first-page":"2342","DOI":"10.1109\/TIT.2011.2111771","volume":"57","author":"EJ Candes","year":"2011","unstructured":"Candes, E. J., & Plan, Y. (2011). Tight oracle inequalities for low-rank matrix recovery from a minimal number of noisy random measurements. IEEE Transactions on Information Theory, 57(4), 2342\u20132359.","journal-title":"IEEE Transactions on Information Theory"},{"issue":"1","key":"6164_CR11","doi-asserted-by":"publisher","first-page":"26","DOI":"10.1016\/S0022-0000(71)80005-3","volume":"5","author":"JW Carlyle","year":"1971","unstructured":"Carlyle, J. W., & Paz, A. (1971). Realizations by stochastic finite automata. Journal of Computer and System Sciences, 5(1), 26\u201340.","journal-title":"Journal of Computer and System Sciences"},{"key":"6164_CR12","unstructured":"Caron, R., & Traynor, T. (2005). The zero set of a polynomial. WSMR Report, 05\u201302."},{"key":"6164_CR13","doi-asserted-by":"crossref","unstructured":"Chen, Y., Gilroy, S., Maletti, A., May, J., & Knight, K. (2018). Recurrent neural networks as weighted language recognizers (pp. 2261\u20132271).","DOI":"10.18653\/v1\/N18-1205"},{"issue":"4\u20135","key":"6164_CR14","first-page":"249","volume":"9","author":"A Cichocki","year":"2016","unstructured":"Cichocki, A., Lee, N., Oseledets, I., Phan, A. H., Zhao, Q., Mandic, D. P., & Sugiyama, M. (2016). Tensor networks for dimensionality reduction and large-scale optimization: Part 1 low-rank tensor decompositions. Foundations and Trends&#x00AE. Machine Learning, 9(4\u20135), 249\u2013429.","journal-title":"Machine Learning"},{"key":"6164_CR15","unstructured":"Cohen, N., Sharir, O., & Shashua, A. (2016). On the expressive power of deep learning: A tensor analysis (pp. 698\u2013728)."},{"key":"6164_CR16","unstructured":"Critch, A. (2013). Algebraic geometry of hidden markov and related models. University of California. PhD thesis."},{"issue":"095","key":"6164_CR17","first-page":"095","volume":"10","author":"A Critch","year":"2014","unstructured":"Critch, A., & Morton, J. (2014). Algebraic geometry of matrix product states. SIGMA, 10(095), 095.","journal-title":"SIGMA"},{"key":"6164_CR18","unstructured":"Denis, F., & Esposito, Y. (2008). On rational stochastic languages. Fundamenta Informaticae, 86(1, 2), 41\u201377."},{"key":"6164_CR19","unstructured":"Downey, C., Hefny, A., Boots, B., Gordon, G. J., & Li, B. (2017). Predictive state recurrent neural networks (pp. 6055\u20136066)."},{"issue":"2","key":"6164_CR20","doi-asserted-by":"publisher","first-page":"179","DOI":"10.1207\/s15516709cog1402_1","volume":"14","author":"JL Elman","year":"1990","unstructured":"Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14(2), 179\u2013211.","journal-title":"Cognitive Science"},{"key":"6164_CR21","unstructured":"Federer, H. (2014). Geometric measure theory. Springer."},{"issue":"9","key":"6164_CR22","first-page":"197","volume":"53","author":"M Fliess","year":"1974","unstructured":"Fliess, M. (1974). Matrices de Hankel. Journal de Math\u00e9matiques Pures et Appliqu\u00e9es, 53(9), 197\u2013222.","journal-title":"Journal de Math\u00e9matiques Pures et Appliqu\u00e9es"},{"key":"6164_CR23","unstructured":"Gel\u00df, P. (2017). The tensor-train format and its applications: Modeling and analysis of chemical reaction networks, catalytic processes, fluid flows, and brownian dynamics. PhD thesis"},{"issue":"10","key":"6164_CR24","doi-asserted-by":"publisher","first-page":"2451","DOI":"10.1162\/089976600300015015","volume":"12","author":"FA Gers","year":"2000","unstructured":"Gers, F. A., Schmidhuber, J., & Cummins, F. (2000). Learning to forget: Continual prediction with LSTM. Neural Computation, 12(10), 2451\u20132471.","journal-title":"Neural Computation"},{"key":"6164_CR25","unstructured":"Giles, C. L., Sun, G. Z., Chen, H. H., Lee, Y. C., & Chen, D. (1990). Higher order recurrent networks and grammatical inference (pp. 380\u2013387)."},{"issue":"3","key":"6164_CR26","doi-asserted-by":"publisher","first-page":"393","DOI":"10.1162\/neco.1992.4.3.393","volume":"4","author":"CL Giles","year":"1992","unstructured":"Giles, C. L., Miller, C. B., Chen, D., Chen, H. H., Sun, G. Z., & Lee, Y. C. (1992). Learning and extracting finite state automata with second-order recurrent neural networks. Neural Computation, 4(3), 393\u2013405.","journal-title":"Neural Computation"},{"key":"6164_CR27","unstructured":"Glaude, H., & Pietquin, O. (2016). PAC learning of probabilistic automaton based on the method of moments (pp. 820\u2013829)."},{"key":"6164_CR28","doi-asserted-by":"crossref","unstructured":"Graves, A., Mohamed, A., & Hinton, G. (2013). Speech recognition with deep recurrent neural networks (pp. 6645\u20136649).","DOI":"10.1109\/ICASSP.2013.6638947"},{"key":"6164_CR29","doi-asserted-by":"crossref","unstructured":"Han, Z. Y., Wang, J., Fan, H., Wang, L., & Zhang, P. (2018). Unsupervised generative modeling using matrix product states. Physical Review X, 8(3), 031012.","DOI":"10.1103\/PhysRevX.8.031012"},{"issue":"6","key":"6164_CR30","doi-asserted-by":"publisher","first-page":"45","DOI":"10.1145\/2512329","volume":"60","author":"CJ Hillar","year":"2013","unstructured":"Hillar, C. J., & Lim, L. H. (2013). Most tensor problems are np-hard. Journal of the ACM (JACM), 60(6), 45.","journal-title":"Journal of the ACM (JACM)"},{"issue":"2","key":"6164_CR31","doi-asserted-by":"publisher","first-page":"A683","DOI":"10.1137\/100818893","volume":"34","author":"S Holtz","year":"2012","unstructured":"Holtz, S., Rohwedder, T., & Schneider, R. (2012). The alternating linear scheme for tensor optimization in the tensor train format. SIAM Journal on Scientific Computing, 34(2), A683\u2013A713.","journal-title":"SIAM Journal on Scientific Computing"},{"key":"6164_CR32","unstructured":"Hsu, D. J., Kakade, S. M., & Zhang, T. (2009). A spectral algorithm for learning hidden Markov models."},{"key":"6164_CR33","unstructured":"Jain, P., Meka, R., & Dhillon, I. S. (2010). Guaranteed rank minimization via singular value projection (pp. 937\u2013945)."},{"key":"6164_CR34","unstructured":"Khrulkov, V., Novikov, A., & Oseledets, I. (2018). Expressive power of recurrent neural networks."},{"key":"6164_CR35","unstructured":"Kingma, D. P., & Ba, J. (2015). Adam: A method for stochastic optimization."},{"issue":"7","key":"6164_CR36","doi-asserted-by":"publisher","first-page":"3359","DOI":"10.1088\/1361-6544\/aabc8f","volume":"31","author":"S Klus","year":"2018","unstructured":"Klus, S., Gel\u00df, P., Peitz, S., & Sch\u00fctte, C. (2018). Tensor-based dynamic mode decomposition. Nonlinearity, 31(7), 3359.","journal-title":"Nonlinearity"},{"issue":"3","key":"6164_CR37","doi-asserted-by":"publisher","first-page":"455","DOI":"10.1137\/07070111X","volume":"51","author":"TG Kolda","year":"2009","unstructured":"Kolda, T. G., & Bader, B. W. (2009). Tensor decompositions and applications. SIAM Review, 51(3), 455\u2013500.","journal-title":"SIAM Review"},{"issue":"1\u20133","key":"6164_CR38","doi-asserted-by":"publisher","first-page":"276","DOI":"10.1016\/0167-2789(86)90300-6","volume":"22","author":"Y Lee","year":"1986","unstructured":"Lee, Y., Doolen, G., Chen, H., Sun, G., Maxwell, T., Lee, H., & Giles, C. L. (1986). Machine learning using a higher order correlation network. Physica D: Nonlinear Phenomena, 22(1\u20133), 276\u2013306.","journal-title":"Physica D: Nonlinear Phenomena"},{"key":"6164_CR39","unstructured":"Li, T., Rabusseau, G., & Precup, D. (2018). Nonlinear weighted finite automata (pp. 679\u2013688)."},{"key":"6164_CR40","unstructured":"Lin, Q., Hammerschmidt, C., Pellegrino, G., & Verwer, S. (2016). Short-term time series forecasting with regression automata."},{"issue":"2","key":"6164_CR41","doi-asserted-by":"publisher","first-page":"470","DOI":"10.1137\/120885723","volume":"34","author":"C Lubich","year":"2013","unstructured":"Lubich, C., Rohwedder, T., Schneider, R., & Vandereycken, B. (2013). Dynamical approximation by hierarchical tucker and tensor-train tensors. SIAM Journal on Matrix Analysis and Applications, 34(2), 470\u2013494.","journal-title":"SIAM Journal on Matrix Analysis and Applications"},{"key":"6164_CR42","unstructured":"Ma, X., Zhang, P., Zhang, S., Duan, N., Hou, Y., Zhou, M., & Song, D. (2019). A tensorized transformer for language modeling. In Advances in neural information processing systems (pp. 2229\u20132239)"},{"key":"6164_CR43","doi-asserted-by":"crossref","unstructured":"Merrill, W., Weiss, G., Goldberg, Y., Schwartz, R., Smith, N. A., & Yahav, E. (2020). A formal hierarchy of rnn architectures (pp. 443\u2013459).","DOI":"10.18653\/v1\/2020.acl-main.43"},{"key":"6164_CR44","doi-asserted-by":"crossref","unstructured":"Mikolov, T., Kombrink, S., Burget, L., \u010cernock\u1ef3, J., & Khudanpur, S. (2011). Extensions of recurrent neural network language model (pp. 5528\u20135531).","DOI":"10.1109\/ICASSP.2011.5947611"},{"key":"6164_CR45","unstructured":"Miller, J., Rabusseau, G., & Terilla, J. (2020) Tensor networks for language modeling. arXiv preprint arXiv: 200301039."},{"key":"6164_CR46","unstructured":"Novikov, A., Rodomanov, A., Osokin, A., & Vetrov, D. (2014). Putting mrfs on a tensor train (pp. 811\u2013819)."},{"key":"6164_CR47","unstructured":"Novikov, A., Podoprikhin, D., Osokin, A., & Vetrov, D. P. (2015). Tensorizing neural networks. In Advances in neural information processing systems (pp. 442\u2013450)"},{"issue":"6","key":"6164_CR48","doi-asserted-by":"publisher","first-page":"937","DOI":"10.1145\/235809.235811","volume":"43","author":"CW Omlin","year":"1996","unstructured":"Omlin, C. W., & Giles, C. L. (1996). Constructing deterministic finite-state automata in recurrent neural networks. Journal of the ACM (JACM), 43(6), 937\u2013972.","journal-title":"Journal of the ACM (JACM)"},{"key":"6164_CR49","doi-asserted-by":"publisher","first-page":"117","DOI":"10.1016\/j.aop.2014.06.013","volume":"349","author":"R Or\u00fas","year":"2014","unstructured":"Or\u00fas, R. (2014). A practical introduction to tensor networks: Matrix product states and projected entangled pair states. Annals of Physics, 349, 117\u2013158.","journal-title":"Annals of Physics"},{"issue":"5","key":"6164_CR50","doi-asserted-by":"publisher","first-page":"2295","DOI":"10.1137\/090752286","volume":"33","author":"IV Oseledets","year":"2011","unstructured":"Oseledets, I. V. (2011). Tensor-train decomposition. SIAM Journal on Scientific Computing, 33(5), 2295\u20132317.","journal-title":"SIAM Journal on Scientific Computing"},{"key":"6164_CR51","doi-asserted-by":"crossref","unstructured":"Peng, H., Schwartz, R., Thomson, S., & Smith, N. A. (2018). Rational recurrences (pp. 1203\u20131214).","DOI":"10.18653\/v1\/D18-1152"},{"key":"6164_CR52","doi-asserted-by":"crossref","unstructured":"Pollack, J. B. (1991). The induction of dynamical recognizers. In Connectionist approaches to language learning (pp. 123\u2013148). Springer.","DOI":"10.1007\/978-1-4615-4008-3_6"},{"key":"6164_CR53","unstructured":"Quattoni, A., Carreras, X., & Gall\u00e9, M. (2017). A maximum matching algorithm for basis selection in spectral learning (pp. 1477\u20131485)."},{"key":"6164_CR54","unstructured":"Rabusseau, G. (2016). A tensor perspective on weighted automata, low-rank regression and algebraic mixtures. Aix-Marseille Universit\u00e9. PhD thesis."},{"key":"6164_CR55","unstructured":"Rabusseau, G., Balle, B., & Pineau, J. (2017). Multitask spectral learning of weighted automata (pp. 2585\u20132594)"},{"key":"6164_CR56","unstructured":"Rabusseau, G., Li, T., & Precup, D. (2019). Connecting weighted automata and recurrent neural networks through spectral learning (pp. 1630\u20131639)."},{"key":"6164_CR57","doi-asserted-by":"publisher","first-page":"220","DOI":"10.1016\/j.laa.2017.02.028","volume":"523","author":"H Rauhut","year":"2017","unstructured":"Rauhut, H., Schneider, R., & Stojanac, \u017d. (2017). Low rank tensor recovery via iterative hard thresholding. Linear Algebra and its Applications, 523, 220\u2013262.","journal-title":"Linear Algebra and its Applications"},{"key":"6164_CR58","doi-asserted-by":"crossref","unstructured":"Recasens, A., & Quattoni, A. (2013). Spectral learning of sequence taggers over continuous sequences (pp. 289\u2013304).","DOI":"10.1007\/978-3-642-40988-2_19"},{"issue":"3","key":"6164_CR59","doi-asserted-by":"publisher","first-page":"471","DOI":"10.1137\/070697835","volume":"52","author":"B Recht","year":"2010","unstructured":"Recht, B., Fazel, M., & Parrilo, P. A. (2010). Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization. SIAM Review, 52(3), 471\u2013501.","journal-title":"SIAM Review"},{"issue":"1","key":"6164_CR60","doi-asserted-by":"publisher","first-page":"96","DOI":"10.1016\/j.aop.2010.09.012","volume":"326","author":"U Schollw\u00f6ck","year":"2011","unstructured":"Schollw\u00f6ck, U. (2011). The density-matrix renormalization group in the age of matrix product states. Annals of Physics, 326(1), 96\u2013192.","journal-title":"Annals of Physics"},{"key":"6164_CR61","unstructured":"Sedghi, H., & Anandkumar, A. (2016). Training input-output recurrent neural networks through spectral methods. arXiv preprint arXiv:160300954."},{"key":"6164_CR62","doi-asserted-by":"crossref","unstructured":"Siegelmann, H. T., & Sontag, E. D. (1992). On the computational power of neural nets. In Proceedings of COLT (pp. 440\u2013449). ACM.","DOI":"10.1145\/130385.130432"},{"key":"6164_CR63","unstructured":"Socher. R., Chen, D., Manning, C. D., & Ng, A. (2013a). Reasoning with neural tensor networks for knowledge base completion. In Advances in neural information processing systems (pp. 926\u2013934)."},{"key":"6164_CR64","unstructured":"Socher, R., Perelygin, A., Wu, J., Chuang, J., Manning, C. D., Ng, A. Y., & Potts, C. (2013b). Recursive deep models for semantic compositionality over a sentiment treebank (pp. 1631\u20131642)."},{"key":"6164_CR65","unstructured":"Stoudenmire, E., & Schwab, D. J. (2016). Supervised learning with tensor networks. In Advances in neural information processing systems (pp. 4799\u20134807)"},{"key":"6164_CR66","unstructured":"Sutskever, I., Martens, J., & Hinton, G. E. (2011). Generating text with recurrent neural networks (pp. 1017\u20131024)."},{"key":"6164_CR67","first-page":"103","volume":"16","author":"M Thon","year":"2015","unstructured":"Thon, M., & Jaeger, H. (2015). Links between multiplicity automata, observable operator models and predictive state representations: a unified learning framework. Journal of Machine Learning Research, 16, 103\u2013147.","journal-title":"Journal of Machine Learning Research"},{"key":"6164_CR68","doi-asserted-by":"crossref","unstructured":"Tjandra, A., Sakti, S., & Nakamura, S. (2017). Compressing recurrent neural network with tensor train (pp. 4451\u20134458).","DOI":"10.1109\/IJCNN.2017.7966420"},{"key":"6164_CR69","doi-asserted-by":"crossref","unstructured":"Wang, W., Aggarwal, V., & Aeron, S. (2017). Efficient low rank tensor ring completion (pp. 5697\u20135705).","DOI":"10.1109\/ICCV.2017.607"},{"key":"6164_CR70","unstructured":"Weiss, G., Goldberg, Y., & Yahav, E. (2018). Extracting automata from recurrent neural networks using queries and counterexamples (pp. 5244\u20135253)."},{"key":"6164_CR71","unstructured":"Wu, Y., Zhang, S., Zhang, Y., Bengio, Y., & Salakhutdinov, R. R. (2016). On multiplicative integration with recurrent neural networks (pp. 2856\u20132864)."},{"key":"6164_CR72","unstructured":"Yang, Y., Krompass, D., & Tresp, V. (2017). Tensor-train recurrent neural networks for video classification (pp. 3891\u20133900)."},{"key":"6164_CR73","unstructured":"Yu, R., Zheng, S., Anandkumar, A., & Yue, Y. (2017). Long-term forecasting using tensor-train rnns. arXiv preprint arXiv:171100073."}],"container-title":["Machine Learning"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10994-022-06164-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10994-022-06164-1\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10994-022-06164-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,5,2]],"date-time":"2024-05-02T18:07:04Z","timestamp":1714673224000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10994-022-06164-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,3,30]]},"references-count":73,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2024,5]]}},"alternative-id":["6164"],"URL":"https:\/\/doi.org\/10.1007\/s10994-022-06164-1","relation":{},"ISSN":["0885-6125","1573-0565"],"issn-type":[{"type":"print","value":"0885-6125"},{"type":"electronic","value":"1573-0565"}],"subject":[],"published":{"date-parts":[[2022,3,30]]},"assertion":[{"value":"9 May 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"13 September 2021","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"13 December 2021","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"30 March 2022","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"This content has been made available to all.","name":"free","label":"Free to read"}]}}