{
  "status": "ok",
  "message-type": "work",
  "message-version": "1.0.0",
  "message": {
    "indexed": {"date-parts": [[2025, 10, 1]], "date-time": "2025-10-01T15:42:54Z", "timestamp": 1759333374996, "version": "3.33.0"},
    "reference-count": 22,
    "publisher": "MIT Press",
    "issue": "2",
    "content-domain": {"domain": ["direct.mit.edu"], "crossmark-restriction": true},
    "short-container-title": [],
    "published-print": {"date-parts": [[2025, 1, 21]]},
    "abstract": "<jats:title>Abstract</jats:title>\n<jats:p>In this article, we mainly study the depth and width of autoencoders consisting of rectified linear unit (ReLU) activation functions. An autoencoder is a layered neural network consisting of an encoder, which compresses an input vector to a lower-dimensional vector, and a decoder, which transforms the low-dimensional vector back to the original input vector exactly (or approximately). In a previous study, Melkman et al. (2023) studied the depth and width of autoencoders using linear threshold activation functions with binary input and output vectors. We show that similar theoretical results hold if autoencoders using ReLU activation functions with real input and output vectors are used. Furthermore, we show that it is possible to compress input vectors to one-dimensional vectors using ReLU activation functions, although the size of compressed vectors is trivially Ω(log n) for autoencoders with linear threshold activation functions, where n is the number of input vectors. We also study the cases of linear activation functions. The results suggest that the compressive power of autoencoders using linear activation functions is considerably limited compared with those using ReLU activation functions.</jats:p>",
    "DOI": "10.1162/neco_a_01729",
    "type": "journal-article",
    "created": {"date-parts": [[2024, 12, 2]], "date-time": "2024-12-02T21:23:41Z", "timestamp": 1733174621000},
    "page": "235-259",
    "update-policy": "https://doi.org/10.1162/mitpressjournals.corrections.policy",
    "source": "Crossref",
    "is-referenced-by-count": 3,
    "title": ["On the Compressive Power of Autoencoders With Linear and ReLU Activation Functions"],
    "prefix": "10.1162",
    "volume": "37",
    "author": [
      {"given": "Liangjie", "family": "Sun", "sequence": "first", "affiliation": [{"name": "Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto 611-0011, Japan"}, {"name": "Department of Mathematics, University of Hong Kong, Hong Kong ljsun_seu@126.com"}]},
      {"given": "Chenyao", "family": "Wu", "sequence": "additional", "affiliation": [{"name": "Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto 611-0011, Japan wcy3442@gmail.com"}]},
      {"given": "Wai-Ki", "family": "Ching", "sequence": "additional", "affiliation": [{"name": "Department of Mathematics, University of Hong Kong, Hong Kong wching@hku.hk"}]},
      {"given": "Tatsuya", "family": "Akutsu", "sequence": "additional", "affiliation": [{"name": "Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto 611-0011, Japan takutsu@kuicr.kyoto-u.ac.jp"}]}
    ],
    "member": "281",
    "published-online": {"date-parts": [[2025, 1, 21]]},
    "reference": [
      {"issue": "1", "key": "2025012818240890300_bib1", "first-page": "147", "article-title": "A learning algorithm for Boltzmann machines", "volume": "9", "author": "Ackley", "year": "1985", "journal-title": "Cognitive Science"},
      {"key": "2025012818240890300_bib2", "doi-asserted-by": "publisher", "first-page": "1", "DOI": "10.1109/TNNLS.2023.3342818", "article-title": "On the size and width of the decoder of a Boolean threshold autoencoder", "volume": "99", "author": "Akutsu", "year": "2023", "journal-title": "IEEE Transactions on Neural Networks and Learning Systems"},
      {"key": "2025012818240890300_bib3", "doi-asserted-by": "publisher", "first-page": "14", "DOI": "10.1016/j.neunet.2021.01.026", "article-title": "A survey on modern trainable activation functions", "volume": "138", "author": "Apicella", "year": "2021", "journal-title": "Neural Networks"},
      {"key": "2025012818240890300_bib4", "first-page": "37", "article-title": "Autoencoders, unsupervised learning, and deep architectures", "volume-title": "Proceedings of ICML Workshop on Unsupervised and Transfer Learning", "author": "Baldi", "year": "2012"},
      {"issue": "1", "key": "2025012818240890300_bib5", "doi-asserted-by": "publisher", "first-page": "53", "DOI": "10.1016/0893-6080(89)90014-2", "article-title": "Neural networks and principal component analysis: Learning from examples without local minima", "volume": "2", "author": "Baldi", "year": "1989", "journal-title": "Neural Networks"},
      {"key": "2025012818240890300_bib6", "first-page": "666", "article-title": "Shallow vs. deep sum-product networks", "volume-title": "Advances in neural information processing systems", "author": "Delalleau", "year": "2011"},
      {"journal-title": "Tutorial on variational autoencoders", "year": "2016", "author": "Doersch", "key": "2025012818240890300_bib7"},
      {"key": "2025012818240890300_bib8", "doi-asserted-by": "publisher", "first-page": "92", "DOI": "10.1016/j.neucom.2022.06.111", "article-title": "Activation functions in deep learning: A comprehensive survey and benchmark", "volume": "503", "author": "Dubey", "year": "2022", "journal-title": "Neurocomputing"},
      {"issue": "2", "key": "2025012818240890300_bib9", "doi-asserted-by": "crossref", "first-page": "268", "DOI": "10.1021/acscentsci.7b00572", "article-title": "Automatic chemical design using a data-driven continuous representation of molecules", "volume": "4", "author": "Gómez-Bombarelli", "year": "2018", "journal-title": "ACS Central Science"},
      {"issue": "5786", "key": "2025012818240890300_bib10", "doi-asserted-by": "publisher", "first-page": "504", "DOI": "10.1126/science.1127647", "article-title": "Reducing the dimensionality of data with neural networks", "volume": "313", "author": "Hinton", "year": "2006", "journal-title": "Science"},
      {"key": "2025012818240890300_bib11", "doi-asserted-by": "crossref", "DOI": "10.1016/j.neucom.2023.126520", "article-title": "Additive autoencoder for dimension estimation", "volume": "551", "author": "Kärkkäinen", "year": "2023", "journal-title": "Neurocomputing"},
      {"issue": "4", "key": "2025012818240890300_bib12", "doi-asserted-by": "publisher", "first-page": "1019", "DOI": "10.1162/neco_a_01486", "article-title": "Comparison of the representational power of random forests, binary decision diagrams, and neural networks", "volume": "34", "author": "Kumano", "year": "2022", "journal-title": "Neural Computation"},
      {"issue": "7", "key": "2025012818240890300_bib13", "doi-asserted-by": "crossref", "DOI": "10.3390/e23070862", "article-title": "Information flows of diverse autoencoders", "volume": "23", "author": "Lee", "year": "2021", "journal-title": "Entropy"},
      {"issue": "2", "key": "2025012818240890300_bib14", "doi-asserted-by": "publisher", "first-page": "92", "DOI": "10.1109/TNNLS.2021.3104646", "article-title": "On the compressive power of Boolean threshold autoencoders", "volume": "34", "author": "Melkman", "year": "2023", "journal-title": "IEEE Transactions on Neural Networks and Learning Systems"},
      {"key": "2025012818240890300_bib15", "first-page": "2924", "article-title": "On the number of linear regions of deep neural networks", "volume-title": "Advances in neural information processing systems", "author": "Montufar", "year": "2014"},
      {"key": "2025012818240890300_bib16", "first-page": "1", "article-title": "On the information plane of autoencoders", "volume-title": "Proceedings of the 2020 International Joint Conference on Neural Networks", "author": "Tapia", "year": "2020"},
      {"journal-title": "Recent advances in autoencoder-based representation learning", "year": "2018", "author": "Tschannen", "key": "2025012818240890300_bib17"},
      {"key": "2025012818240890300_bib18", "first-page": "468", "article-title": "Fast methods for estimating the numerical rank of large matrices", "volume-title": "Proceedings of the 33rd International Conference on Machine Learning", "author": "Ubaru", "year": "2016"},
      {"issue": "4", "key": "2025012818240890300_bib19", "doi-asserted-by": "publisher", "first-page": "1004", "DOI": "10.1137/20M1314884", "article-title": "Memory capacity of neural networks with threshold and rectified linear unit activations", "volume": "2", "author": "Vershynin", "year": "2020", "journal-title": "SIAM Journal on Mathematics of Data Science"},
      {"key": "2025012818240890300_bib20", "doi-asserted-by": "publisher", "first-page": "104", "DOI": "10.1016/j.neunet.2019.05.003", "article-title": "Understanding autoencoders with information theoretic concepts", "volume": "117", "author": "Yu", "year": "2019", "journal-title": "Neural Networks"},
      {"key": "2025012818240890300_bib21", "first-page": "15532", "article-title": "Small ReLU networks are powerful memorizers: A tight analysis of memorization capacity", "volume-title": "Advances in neural information processing systems", "author": "Yun", "year": "2019"},
      {"key": "2025012818240890300_bib22", "article-title": "Understanding deep learning requires rethinking generalization", "volume-title": "Proceedings of the International Conference on Learning Representations", "author": "Zhang", "year": "2017"}
    ],
    "container-title": ["Neural Computation"],
    "original-title": [],
    "language": "en",
    "link": [
      {"URL": "https://direct.mit.edu/neco/article-pdf/37/2/235/2482171/neco_a_01729.pdf", "content-type": "application/pdf", "content-version": "vor", "intended-application": "syndication"},
      {"URL": "https://direct.mit.edu/neco/article-pdf/37/2/235/2482171/neco_a_01729.pdf", "content-type": "unspecified", "content-version": "vor", "intended-application": "similarity-checking"}
    ],
    "deposited": {"date-parts": [[2025, 1, 28]], "date-time": "2025-01-28T18:24:28Z", "timestamp": 1738088668000},
    "score": 1,
    "resource": {"primary": {"URL": "https://direct.mit.edu/neco/article/37/2/235/125502/On-the-Compressive-Power-of-Autoencoders-With"}},
    "subtitle": [],
    "short-title": [],
    "issued": {"date-parts": [[2025, 1, 21]]},
    "references-count": 22,
    "journal-issue": {"issue": "2", "published-online": {"date-parts": [[2025, 1, 21]]}, "published-print": {"date-parts": [[2025, 1, 21]]}},
    "URL": "https://doi.org/10.1162/neco_a_01729",
    "relation": {},
    "ISSN": ["0899-7667", "1530-888X"],
    "issn-type": [{"type": "print", "value": "0899-7667"}, {"type": "electronic", "value": "1530-888X"}],
    "subject": [],
    "published-other": {"date-parts": [[2025, 2]]},
    "published": {"date-parts": [[2025, 1, 21]]}
  }
}