{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,28]],"date-time":"2026-02-28T18:23:07Z","timestamp":1772302987725,"version":"3.50.1"},"reference-count":55,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2024,5,30]],"date-time":"2024-05-30T00:00:00Z","timestamp":1717027200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Big Data"],"abstract":"<jats:p>Learning from complex, multidimensional data has become central to computational mathematics, and among the most successful high-dimensional function approximators are deep neural networks (DNNs). Training DNNs is posed as an optimization problem to learn network weights or parameters that well-approximate a mapping from input to target data. Multiway data or tensors arise naturally in myriad ways in deep learning, in particular as input data and as high-dimensional weights and features extracted by the network, with the latter often being a bottleneck in terms of speed and memory. In this work, we leverage tensor representations and processing to efficiently parameterize DNNs when learning from high-dimensional data. We propose tensor neural networks (t-NNs), a natural extension of traditional fully-connected networks, that can be trained efficiently in a reduced, yet more powerful parameter space. Our t-NNs are built upon matrix-mimetic tensor-tensor products, which retain algebraic properties of matrix multiplication while capturing high-dimensional correlations. Mimeticity enables t-NNs to inherit desirable properties of modern DNN architectures. We exemplify this by extending recent work on stable neural networks, which interpret DNNs as discretizations of differential equations, to our multidimensional framework. We provide empirical evidence of the parametric advantages of t-NNs on dimensionality reduction using autoencoders and classification using fully-connected and stable variants on benchmark imaging datasets MNIST and CIFAR-10.<\/jats:p>","DOI":"10.3389\/fdata.2024.1363978","type":"journal-article","created":{"date-parts":[[2024,5,30]],"date-time":"2024-05-30T12:32:00Z","timestamp":1717072320000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["Stable tensor neural networks for efficient deep learning"],"prefix":"10.3389","volume":"7","author":[{"given":"Elizabeth","family":"Newman","sequence":"first","affiliation":[]},{"given":"Lior","family":"Horesh","sequence":"additional","affiliation":[]},{"given":"Haim","family":"Avron","sequence":"additional","affiliation":[]},{"given":"Misha E.","family":"Kilmer","sequence":"additional","affiliation":[]}],"member":"1965","published-online":{"date-parts":[[2024,5,30]]},"reference":[{"key":"B1","volume-title":"Numerical Methods for Evolutionary Differential Equations","author":"Ascher","year":"2010"},{"key":"B2","doi-asserted-by":"publisher","first-page":"157","DOI":"10.1109\/72.279181","article-title":"Learning long-term dependencies with gradient descent is difficult","volume":"5","author":"Bengio","year":"1994","journal-title":"IEEE Trans. Neural Netw"},{"key":"B3","doi-asserted-by":"publisher","first-page":"223","DOI":"10.1137\/16M1080173","article-title":"Optimization methods for large-scale machine learning","volume":"60","author":"Bottou","year":"2018","journal-title":"SIAM Rev."},{"key":"B4","doi-asserted-by":"crossref","DOI":"10.1201\/b10905","volume-title":"Handbook of Markov Chain Monte Carlo","author":"Brooks","year":"2011"},{"key":"B5","doi-asserted-by":"publisher","DOI":"10.48550\/arxiv.1712.09520","article-title":"Tensor regression networks with various low-rank tensor approximations","author":"Cao","year":"2017","journal-title":"arXiv"},{"key":"B6","doi-asserted-by":"publisher","first-page":"283","DOI":"10.1007\/BF02310791","article-title":"Analysis of individual differences in multidimensional scaling via an n-way generalization of \u201cEckart-Young\u201d decomposition","volume":"35","author":"Carroll","year":"1970","journal-title":"Psychometrika"},{"key":"B7","doi-asserted-by":"publisher","first-page":"1317","DOI":"10.1038\/s41598-020-57897-9","article-title":"Predicting clustered weather patterns: a test case for applications of convolutional neural networks to spatio-temporal climate data","volume":"10","author":"Chattopadhyay","year":"2020","journal-title":"Sci. Rep"},{"key":"B8","doi-asserted-by":"publisher","first-page":"1998","DOI":"10.1109\/TNNLS.2017.2690379","article-title":"Tensor-factorized neural networks","volume":"29","author":"Chien","year":"2018","journal-title":"IEEE Trans. Neural Netw"},{"key":"B9","doi-asserted-by":"publisher","first-page":"249","DOI":"10.1561\/2200000059","article-title":"Tensor networks for dimensionality reduction and large-scale optimization: part 1 low-rank tensor decompositions","volume":"9","author":"Cichocki","year":"2016","journal-title":"Found. Trends Mach. Learn"},{"key":"B10","doi-asserted-by":"publisher","first-page":"1253","DOI":"10.1137\/S0895479896305696","article-title":"A multilinear singular value decomposition","volume":"21","author":"de Lathauwer","year":"2000","journal-title":"SIAM J. Matrix Anal. Appl"},{"key":"B11","unstructured":"\u201cPredicting parameters in deep learning,\u201d21482156\n            DenilM.\n            ShakibiB.\n            DinhL.\n            RanzatoM.\n            de FrietasN.\n          Advances in Neural Information Processing Systems 262013"},{"key":"B12","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s40304-017-0103-z","article-title":"A proposal on machine learning via dynamical systems","volume":"5","author":"Ee","year":"2017","journal-title":"Comm. Math. Stat"},{"key":"B13","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1088\/1361-6420\/aa9a90","article-title":"Stable architectures for deep neural networks","volume":"34","author":"Haber","year":"2017","journal-title":"Inverse Probl"},{"key":"B14","doi-asserted-by":"publisher","first-page":"3142","DOI":"10.1609\/aaai.v32i1.11680","article-title":"Learning across scales\u2013multiscale methods for convolution neural networks","volume":"32","author":"Haber","year":"2018","journal-title":"Proc. AAAI Conf. Artif. Intell"},{"key":"B15","doi-asserted-by":"publisher","first-page":"437","DOI":"10.1137\/110842570","article-title":"Facial recognition using tensor-tensor decompositions","volume":"6","author":"Hao","year":"2013","journal-title":"SIAM J. Imaging Sci"},{"key":"B16","first-page":"1","article-title":"\u201cFoundations of the parafac procedure: models and conditions for an \u201cexplanatory\u201d multimodal factor analysis,\u201d in","volume":"16","author":"Harshman","year":"1970","journal-title":"UCLA Working Papers in Phonetics"},{"key":"B17","doi-asserted-by":"crossref","first-page":"770","DOI":"10.1109\/CVPR.2016.90","article-title":"\u201cDeep residual learning for image recognition,\u201d","volume-title":"2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","author":"He","year":"2016"},{"key":"B18","doi-asserted-by":"publisher","first-page":"165","DOI":"10.1016\/j.neucom.2021.10.036","article-title":"Deep kronecker neural networks: a general framework for neural networks with adaptive activation functions","volume":"468","author":"Jagtap","year":"2022","journal-title":"Neurocomputing"},{"key":"B19","doi-asserted-by":"publisher","first-page":"270","DOI":"10.1137\/21S1456522","article-title":"A tensor SVD-based classification algorithm applied to fmri data","volume":"15","author":"Keegan","year":"2022","journal-title":"SIAM Undergrad. Res. Online"},{"key":"B20","doi-asserted-by":"publisher","first-page":"545","DOI":"10.1016\/j.laa.2015.07.021","article-title":"Tensor-tensor products with invertible linear transforms","volume":"485","author":"Kernfeld","year":"2015","journal-title":"Linear Algebra Appl"},{"key":"B21","doi-asserted-by":"publisher","first-page":"148","DOI":"10.1137\/110837711","article-title":"Third-order tensors as operators on matrices: a theoretical and computational framework with applications in imaging","volume":"34","author":"Kilmer","year":"2013","journal-title":"SIAM J. Matrix Anal. Appl"},{"key":"B22","doi-asserted-by":"publisher","first-page":"e2015851118","DOI":"10.1073\/pnas.2015851118","article-title":"Tensor-tensor algebra for optimal representation and compression of multiway data","volume":"118","author":"Kilmer","year":"2021","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"B23","doi-asserted-by":"publisher","first-page":"641","DOI":"10.1016\/j.laa.2010.09.020","article-title":"Factorization strategies for third-order tensors","volume":"435","author":"Kilmer","year":"2011","journal-title":"Linear Algebra Appl"},{"key":"B24","article-title":"\u201cAdam: a method for stochastic optimization,\u201d","author":"Kingma","year":"2015","journal-title":"3rd International Conference on Learning Representations, ICLR 2015, May 7-9, 2015, Conference Track Proceedings"},{"key":"B25","doi-asserted-by":"publisher","first-page":"455","DOI":"10.1137\/07070111X","article-title":"Tensor decompositions and applications","volume":"51","author":"Kolda","year":"2009","journal-title":"SIAM Rev"},{"key":"B26","doi-asserted-by":"publisher","first-page":"1","DOI":"10.48550\/arXiv.1707.08308","article-title":"Tensor regression networks","volume":"21","author":"Kossaifi","year":"2020","journal-title":"J. Mach. Learn. Res"},{"key":"B27","unstructured":"KrizhevskyA.\n            HintonG.\n          Learning multiple layers of features from tiny images2009"},{"key":"B28","unstructured":"\u201cImageNet classification with deep convolutional neural networks,\u201d\n            KrizhevskyA.\n            SutskeverI.\n            HintonG. E.\n          Advances in Neural Information Processing Systems, Vol. 252012"},{"key":"B29","doi-asserted-by":"publisher","first-page":"79","DOI":"10.1214\/aoms\/1177729694","article-title":"On information and sufficiency","volume":"22","author":"Kullback","year":"1951","journal-title":"Ann. Math. Stat"},{"key":"B30","unstructured":"LeCunY.\n            CortesC.\n            BurgesC. J. C.\n          The MNIST Database of Handwritten Digits.1998"},{"key":"B31","article-title":"\u201cOptimal brain damage,\u201d","author":"LeCun","year":"1989","journal-title":"Advances in Neural Information Processing Systems, Volume 2"},{"key":"B32","doi-asserted-by":"publisher","first-page":"e2019E","DOI":"10.1029\/2019EA001037","article-title":"The tensor-based feature analysis of spatiotemporal field data with heterogeneity","volume":"7","author":"Li","year":"2020","journal-title":"Earth Space Sci"},{"key":"B33","doi-asserted-by":"publisher","first-page":"e2288","DOI":"10.1002\/nla.2288","article-title":"The tensor t- function: a definition for functions of third-order tensors","volume":"27","author":"Lund","year":"2020","journal-title":"Numer. Linear Algebra Appl"},{"key":"B34","doi-asserted-by":"publisher","first-page":"171","DOI":"10.1007\/s10543-021-00877-w","article-title":"Randomized Kaczmarz for tensor linear systems","volume":"62","author":"Ma","year":"2022","journal-title":"BIT Numer. Math"},{"key":"B35","doi-asserted-by":"crossref","first-page":"729","DOI":"10.1137\/1.9781611976700.82","article-title":"\u201cDynamic graph convolutional networks using the tensor M-product,\u201d","volume-title":"Proceedings of the 2021 SIAM International Conference on Data Mining (SDM)","author":"Malik","year":"2021"},{"key":"B36","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1371\/journal.pcbi.1010212","article-title":"Dimensionality reduction of longitudinal omics data using modern tensor factorizations","volume":"18","author":"Mor","year":"2022","journal-title":"PLoS Comput. Biol"},{"key":"B37","author":"Newman","year":"2019","journal-title":"A Step in the Right Dimension: Tensor Algebra and Applications"},{"key":"B38","article-title":"\u201cImage classification using local tensor singular value decompositions,\u201d","volume-title":"2017 IEEE 7th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP)","author":"Newman","year":"2018"},{"key":"B39","doi-asserted-by":"publisher","first-page":"1084","DOI":"10.1137\/19M1297026","article-title":"Nonnegative tensor patch dictionary approaches for image compression and deblurring applications","volume":"13","author":"Newman","year":"2020","journal-title":"SIAM J. Imaging Sci"},{"key":"B40","volume-title":"Neural Networks and Deep Learning","author":"Nielsen","year":"2018"},{"key":"B41","unstructured":"\u201cTensorizing neural networks,\u201d442450\n            NovikovA.\n            PodoprikhinD.\n            OsokinA.\n            VetrovD.\n          Advances in Neural Information Processing Systems 282015"},{"key":"B42","doi-asserted-by":"publisher","first-page":"18371","DOI":"10.1073\/pnas.0709146104","article-title":"A tensor higher-order singular value decomposition for integrative analysis of dna microarray data from different studies","volume":"104","author":"Omberg","year":"2007","journal-title":"Proc. Nat. Acad. Sci"},{"key":"B43","doi-asserted-by":"publisher","first-page":"2295","DOI":"10.1137\/090752286","article-title":"Tensor-train decomposition","volume":"33","author":"Oseledets","year":"2011","journal-title":"SIAM J. Sci. Comput"},{"key":"B44","article-title":"\u201cAutomatic differentiation in pytorch,\u201d","volume-title":"NIPS-W","author":"Paszke","year":"2017"},{"key":"B45","volume-title":"The Matrix Cookbook","author":"Petersen","year":"2012"},{"key":"B46","doi-asserted-by":"publisher","first-page":"234","DOI":"10.1007\/978-3-319-24574-4_28","article-title":"\u201cU-net: convolutional networks for biomedical image segmentation,\u201d","author":"Ronneberger","year":"2015","journal-title":"Medical Image Computing and Computer-Assisted Intervention-MICCAI 2015"},{"key":"B47","doi-asserted-by":"publisher","first-page":"533","DOI":"10.1038\/323533a0","article-title":"Learning representations by back-propagating errors","volume":"323","author":"Rumelhart","year":"1986","journal-title":"Nature"},{"key":"B48","unstructured":"\u201cFailures of gradient-based deep learning,\u201d\n            Shalev-ShwartzS.\n            ShamirO.\n            ShammahS.\n          37141866Proceedings of the 34th International Conference on Machine Learning2017"},{"key":"B49","doi-asserted-by":"publisher","first-page":"172","DOI":"10.1007\/BF01990352","article-title":"Variable step size destabilizes the stromer\/leapfrog\/verlet method","volume":"33","author":"Skeel","year":"1993","journal-title":"BIT"},{"key":"B50","doi-asserted-by":"publisher","first-page":"1425","DOI":"10.1007\/s10543-016-0607-z","article-title":"A tensor-based dictionary learning approach to tomographic image reconstruction","volume":"56","author":"Soltani","year":"2016","journal-title":"BIT Numer. Math"},{"key":"B51","unstructured":"Tufts community appeal2023"},{"key":"B52","doi-asserted-by":"publisher","first-page":"279","DOI":"10.1007\/BF02289464","article-title":"Some mathematical notes on three-mode factor analysis","volume":"31","author":"Tucker","year":"1966","journal-title":"Psychometrika"},{"key":"B53","doi-asserted-by":"publisher","first-page":"447","DOI":"10.1007\/3-540-47969-4_30","article-title":"\u201cMultilinear analysis of image ensembles: tensorfaces,\u201d","author":"Vasilescu","year":"2002","journal-title":"Computer Vision"},{"key":"B54","author":"Wang","year":"2023","journal-title":"Tensor networks meet neural networks: A survey and future perspectives"},{"key":"B55","doi-asserted-by":"crossref","DOI":"10.1109\/CVPR.2014.485","article-title":"\u201cNovel methods for multilinear data completion and denoising based on tensor-SVD,\u201d","volume-title":"2014 IEEE Conference on Computer Vision and Pattern Recognition","author":"Zhang","year":"2014"}],"container-title":["Frontiers in Big Data"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fdata.2024.1363978\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,19]],"date-time":"2024-07-19T10:33:55Z","timestamp":1721385235000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fdata.2024.1363978\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,5,30]]},"references-count":55,"alternative-id":["10.3389\/fdata.2024.1363978"],"URL":"https:\/\/doi.org\/10.3389\/fdata.2024.1363978","relation":{},"ISSN":["2624-909X"],"issn-type":[{"value":"2624-909X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,5,30]]},"article-number":"1363978"}}