{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T03:00:58Z","timestamp":1760238058209,"version":"build-2065373602"},"reference-count":42,"publisher":"MDPI AG","issue":"8","license":[{"start":{"date-parts":[[2022,8,13]],"date-time":"2022-08-13T00:00:00Z","timestamp":1660348800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>In this paper, we study the learnability of the Boolean inner product by a systematic simulation study. The family of the Boolean inner product function is known to be representable by neural networks of threshold neurons of depth 3 with only 2n+1 units (n the input dimension)\u2014whereas an exact representation by a depth 2 network cannot possibly be of polynomial size. This result can be seen as a strong argument for deep neural network architectures. In our study, we found that this depth 3 architecture of the Boolean inner product is difficult to train, much harder than the depth 2 network, at least for the small input size scenarios n\u226416. Nonetheless, the accuracy of the deep architecture increased with the dimension of the input space to 94% on average, which means that multiple restarts are needed to find the compact depth 3 architecture. Replacing the fully connected first layer by a partially connected layer (a kind of convolutional layer sparsely connected with weight sharing) can significantly improve the learning performance up to 99% accuracy in simulations. Another way to improve the learnability of the compact depth 3 representation of the inner product could be achieved by adding just a few additional units into the first hidden layer.<\/jats:p>","DOI":"10.3390\/e24081117","type":"journal-article","created":{"date-parts":[[2022,8,15]],"date-time":"2022-08-15T01:47:21Z","timestamp":1660528041000},"page":"1117","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Learnability of the Boolean Innerproduct in Deep Neural Networks"],"prefix":"10.3390","volume":"24","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8001-3703","authenticated-orcid":false,"given":"Mehmet","family":"Erdal","sequence":"first","affiliation":[{"name":"Institute of Neural Information Processing, Ulm University, 89081 Ulm, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5118-0812","authenticated-orcid":false,"given":"Friedhelm","family":"Schwenker","sequence":"additional","affiliation":[{"name":"Institute of Neural Information Processing, Ulm University, 89081 Ulm, Germany"}]}],"member":"1968","published-online":{"date-parts":[[2022,8,13]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep Learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"key":"ref_2","unstructured":"Schwenker, F., Kestler, H., Palm, G., and Hoher, M. (1994, January 2\u20135). Similarities of LVQ and RBF learning-a survey of learning rules and the application to the classification of signals from high-resolution electrocardiography. Proceedings of the Man and Cybernetics, San Antonio, TX, USA."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"125269","DOI":"10.1109\/ACCESS.2021.3110911","article-title":"Next-generation neural networks: Capsule networks with routing-by-agreement for text classification","volume":"9","author":"Steur","year":"2021","journal-title":"IEEE Access"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Bottou, L., Chapelle, O., DeCoste, D., and Weston, J. (2007). Scaling Learning Algorithms towards AI. Large Scale Kernel Machines, MIT Press.","DOI":"10.7551\/mitpress\/7496.001.0001"},{"key":"ref_5","first-page":"1","article-title":"Exploring Strategies for Training Deep Neural Networks","volume":"10","author":"Larochelle","year":"2009","journal-title":"J. Mach. Learn. Res."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1145\/3446776","article-title":"Understanding Deep Learning (Still) Requires Rethinking Generalization","volume":"64","author":"Zhang","year":"2021","journal-title":"Commun. ACM"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Hastad, J., and Goldmann, M. (1990, January 22\u201324). On the power of small-depth threshold circuits. Proceedings of the 31st Annual Symposium on Foundations of Computer Science, Washington, DC, USA.","DOI":"10.1109\/FSCS.1990.89582"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Furst, M., Saxe, J., and Sipser, M. (1981, January 28\u201330). Parity Circuits and the Polynomial-Time Hierarchy. Proceedings of the 22nd Annual Symposium on Foundations of Computer Science, Washington, DC, USA.","DOI":"10.1109\/SFCS.1981.35"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Parberry, I. (1994). Circuit Complexity and Neural Networks, MIT Press.","DOI":"10.7551\/mitpress\/1836.001.0001"},{"key":"ref_10","unstructured":"Telgarsky, M. (2015). Representation Benefits of Deep Feedforward Networks. arXiv."},{"key":"ref_11","unstructured":"Eldan, R., and Shamir, O. (2016, January 23\u201326). The Power of Depth for Feedforward Neural Networks. Proceedings of the 29th Annual Conference on Learning Theory, New York, NY, USA."},{"key":"ref_12","unstructured":"Lu, Z., Pu, H., Wang, F., Hu, Z., and Wang, L. (2017). The Expressive Power of Neural Networks: A View from the Width. Proceedings of the 31st International Conference on Neural Information Processing Systems, Curran Associates, Inc."},{"key":"ref_13","unstructured":"Lee, D., Sugiyama, M., Luxburg, U., Guyon, I., and Garnett, R. (2016). Optimal Architectures in a Solvable Model of Deep Networks. Proceedings of the Advances in Neural Information Processing Systems, Curran Associates, Inc."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"78","DOI":"10.1145\/2347736.2347755","article-title":"A Few Useful Things to Know about Machine Learning","volume":"55","author":"Domingos","year":"2012","journal-title":"Commun. ACM"},{"key":"ref_15","unstructured":"Anthony, M., and Bartlett, P. (2009). Neural Network Learning: Theoretical Foundations, Cambridge University Press. [1st ed.]."},{"key":"ref_16","unstructured":"Ra\u00fal, R. (1996). Neural Networks: A Systematic Introduction, Springer."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1561\/2200000006","article-title":"Learning Deep Architectures for AI","volume":"2","author":"Bengio","year":"2009","journal-title":"Found. Trends Mach. Learn."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Mhaskar, H., Liao, Q., and Poggio, T. (2017, January 4\u20139). When and Why Are Deep Networks Better than Shallow Ones?. Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, CA, USA.","DOI":"10.1609\/aaai.v31i1.10913"},{"key":"ref_19","unstructured":"Maxwell, N., and Andrew, S. (2018). Are Efficient Deep Representations Learnable?. arXiv."},{"key":"ref_20","unstructured":"Anderson, R.T., Pedro, H.C.A., Jo\u00e3o Marcos, F., M\u00e1rcio, N., Lamb, L., and Moshe, Y.V. (2020). Understanding Boolean Function Learnability on Deep Neural Networks. arXiv."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Sakamoto, N. (2019, January 8\u201311). Classifying Three-input Boolean Functions by Neural Networks. Proceedings of the 2019 20th IEEE\/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel\/Distributed Computing (SNPD), Toyama, Japan.","DOI":"10.1109\/SNPD.2019.8935697"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"351","DOI":"10.1016\/j.neucom.2006.01.025","article-title":"Generalization ability of Boolean functions implemented in feedforward neural networks","volume":"70","author":"Franco","year":"2006","journal-title":"Neurocomputing"},{"key":"ref_23","unstructured":"Steinbach, B., and Kohut, R. (2002, January 19\u201320). Neural Networks\u2014A Model of Boolean Functions. Proceedings of the 5th International Workshop on Boolean Problems, Freiberg, Germany."},{"key":"ref_24","unstructured":"Franco, L., and Anthony, M. (2004, January 25\u201329). On a generalization complexity measure for Boolean functions. Proceedings of the 2004 IEEE International Joint Conference on Neural Networks (IEEE Cat. No.04CH37541), Budapest, Hungary."},{"key":"ref_25","unstructured":"Daniely, A., and Malach, E. (2020, January 6\u201312). Learning Parities with Neural Networks. Proceedings of the 34th International Conference on Neural Information Processing Systems, Red Hook, NY, USA."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"185","DOI":"10.1016\/0004-3702(89)90049-0","article-title":"Connectionist Learning Procedures","volume":"40","author":"Hinton","year":"1989","journal-title":"Artif. Intell."},{"key":"ref_27","first-page":"39","article-title":"Scaling Relationships in Back-Propagation Learning","volume":"2","author":"Tesauro","year":"1988","journal-title":"Complex Syst."},{"key":"ref_28","unstructured":"Martens, J., Chattopadhya, A., Pitassi, T., and Zemel, R. (2013). On the Representational Efficiency of Restricted Boltzmann Machines. Proceedings of the Advances in Neural Information Processing Systems, Curran Associates, Inc."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"1631","DOI":"10.1162\/neco.2008.04-07-510","article-title":"Representational Power of Restricted Boltzmann Machines and Deep Belief Networks","volume":"20","author":"Bengio","year":"2008","journal-title":"Neural Comput."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"60","DOI":"10.1186\/s40537-019-0197-0","article-title":"A survey on Image Data Augmentation for Deep Learning","volume":"6","author":"Shorten","year":"2019","journal-title":"J. Big Data"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Yorioka, D., Kang, H., and Iwamura, K. (2020, January 13\u201316). Data Augmentation For Deep Learning Using Generative Adversarial Networks. Proceedings of the 2020 IEEE 9th Global Conference on Consumer Electronics (GCCE), Kobe, Japan.","DOI":"10.1109\/GCCE50665.2020.9291963"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"16884","DOI":"10.1038\/s41598-019-52737-x","article-title":"Data augmentation using generative adversarial networks (CycleGAN) to improve generalizability in CT segmentation tasks","volume":"9","author":"Sandfort","year":"2019","journal-title":"Sci. Rep."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Anthony, M. (2001). Discrete Mathematics of Neural Networks: Selected Topics, Society for Industrial and Applied Mathematics.","DOI":"10.1137\/1.9780898718539"},{"key":"ref_34","first-page":"9","article-title":"Efficient BackProp","volume":"Volume 7700","author":"LeCun","year":"2012","journal-title":"Neural Networks: Tricks of the Trade"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1007\/BF02551274","article-title":"Approximation by superpositions of a sigmoidal function","volume":"2","author":"Cybenko","year":"1989","journal-title":"Math. Control. Signals Syst. (MCSS)"},{"key":"ref_36","unstructured":"Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning, MIT Press. Available online: http:\/\/www.deeplearningbook.org."},{"key":"ref_37","unstructured":"Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G.S., Davis, A., Dean, J., and Devin, M. (2022, August 10). TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. Available online: https:\/\/www.tensorflow.org."},{"key":"ref_38","unstructured":"Sutskever, I., Martens, J., Dahl, G., and Hinton, G. (2013, January 17\u201319). On the Importance of Initialization and Momentum in Deep Learning. Proceedings of the 30th International Conference on International Conference on Machine Learning, Atlanta, GA, USA."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"55","DOI":"10.3389\/fncom.2019.00055","article-title":"Back-Propagation Learning in Deep Spike-By-Spike Networks","volume":"13","author":"Rotermund","year":"2019","journal-title":"Front. Comput. Neurosci."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Jansen, B., and Nakayama, K. (2006, January 16\u201321). An Adaptive Penalty-Based Learning Extension for Backpropagation and its Variants. Proceedings of the 2006 IEEE International Joint Conference on Neural Network Proceedings, Vancouver, BC, Canada.","DOI":"10.1109\/IJCNN.2006.247341"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"1306","DOI":"10.1109\/72.963767","article-title":"Generalization properties of modular networks: Implementing the parity function","volume":"12","author":"Franco","year":"2001","journal-title":"IEEE Trans. Neural Netw."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1023\/A:1021726007566","article-title":"The Minimum Number of Errors in the N-Parity and its Solution with an Incremental Neural Network","volume":"16","author":"Aguilar","year":"2002","journal-title":"Neural Process. Lett."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/24\/8\/1117\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T00:08:17Z","timestamp":1760141297000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/24\/8\/1117"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,8,13]]},"references-count":42,"journal-issue":{"issue":"8","published-online":{"date-parts":[[2022,8]]}},"alternative-id":["e24081117"],"URL":"https:\/\/doi.org\/10.3390\/e24081117","relation":{},"ISSN":["1099-4300"],"issn-type":[{"type":"electronic","value":"1099-4300"}],"subject":[],"published":{"date-parts":[[2022,8,13]]}}}