{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T03:09:04Z","timestamp":1760238544619,"version":"build-2065373602"},"reference-count":40,"publisher":"MDPI AG","issue":"9","license":[{"start":{"date-parts":[[2022,8,27]],"date-time":"2022-08-27T00:00:00Z","timestamp":1661558400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>The recent boom of artificial Neural Networks (NN) has shown that NN can provide viable solutions to a variety of problems. However, their complexity and the lack of efficient interpretation of NN architectures (commonly considered black box techniques) has adverse effects on the optimization of each NN architecture. One cannot simply use a generic topology and have the best performance in every application field, since the network topology is commonly fine-tuned to the problem\/dataset in question. In this paper, we introduce a novel method of computationally assessing the complexity of the dataset. The NN is treated as an information channel, and thus information theory is used to estimate the optimal number of neurons for each layer, reducing the memory and computational load, while achieving the same, if not greater, accuracy. 
Experiments using common datasets confirm the theoretical findings, and the derived algorithm seems to improve the performance of the original architecture.<\/jats:p>","DOI":"10.3390\/info13090405","type":"journal-article","created":{"date-parts":[[2022,8,29]],"date-time":"2022-08-29T03:34:56Z","timestamp":1661744096000},"page":"405","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["OptiNET\u2014Automatic Network Topology Optimization"],"prefix":"10.3390","volume":"13","author":[{"given":"Andreas","family":"Maniatopoulos","sequence":"first","affiliation":[{"name":"Electrical and Computer Engineering Department, Democritus University of Thrace, 69100 Komotini, Greece"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Paraskevi","family":"Alvanaki","sequence":"additional","affiliation":[{"name":"Electrical and Computer Engineering Department, Democritus University of Thrace, 69100 Komotini, Greece"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0898-6102","authenticated-orcid":false,"given":"Nikolaos","family":"Mitianoudis","sequence":"additional","affiliation":[{"name":"Electrical and Computer Engineering Department, Democritus University of Thrace, 69100 Komotini, Greece"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2022,8,27]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Siu, K., Stuart, D.M., Mahmoud, M., and Moshovos, A. (October, January 30). Memory Requirements for Convolutional Neural Network Hardware Accelerators. Proceedings of the 2018 IEEE International Symposium on Workload Characterization (IISWC), Raleigh, NC, USA.","DOI":"10.1109\/IISWC.2018.8573527"},{"key":"ref_2","unstructured":"Chen, T., Li, M., Li, Y., Lin, M., Wang, N., Wang, M., Xiao, T., Xu, B., Zhang, C., and Zhang, Z. (2015, January 7\u201312). 
MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems. Proceedings of the Neural Information Processing Systems, Workshop on Machine Learning Systems, Montreal, QC, Canada."},{"key":"ref_3","unstructured":"Gruslys, A., Munos, R., Danihelka, I., Lanctot, M., and Graves, A. (2016, January 5\u201310). Memory-Efficient Backpropagation Through Time. Proceedings of the NIPS\u201916: 30th International Conference on Neural Information Processing Systems, Barcelona, Spain."},{"key":"ref_4","unstructured":"Diamos, G., Sengupta, S., Catanzaro, B., Chrzanowski, M., Coates, A., Elsen, E., Engel, J., Hannun, A., and Satheesh, S. (2016, January 19\u201324). Persistent RNNs: Stashing recurrent weights on-chip. Proceedings of the ICML\u201916: 33rd International Conference on Machine Learning, New York, NY, USA."},{"key":"ref_5","unstructured":"Hagan, M., Demuth, H.B., Beale, M.H., and De Jesus, O. (2014). Neural Network Design, Martin Hagan. [2nd ed.]. Available online: https:\/\/hagan.okstate.edu\/nnd.html."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Bishop, C. (1995). Neural Networks for Pattern Recognition, Oxford University Press.","DOI":"10.1093\/oso\/9780198538493.001.0001"},{"key":"ref_7","unstructured":"Theodoridis, S. (2015). Machine Learning: A Bayesian Perspective, Academic Press. [1st ed.]."},{"key":"ref_8","unstructured":"Heaton, J. (2008). Introduction to Neural Networks for Java, Heaton Research, Inc. [2nd ed.]."},{"key":"ref_9","unstructured":"Han, S., Mao, H., and Dally, W.J. (2016, January 2\u20134). Deep compression: Compressing deep neural network with pruning, trained quantization and Huffman coding. Proceedings of the 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico. Available online: http:\/\/arxiv.org\/abs\/1510.00149."},{"key":"ref_10","unstructured":"Lee, N., Ajanthan, T., and Torr, P.H.S. (2019, January 6\u20139). 
Snip: Single-shot network pruning based on connection sensitivity. Proceedings of the 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA. Available online: https:\/\/openreview.net\/forum?id=B1VZqjAcYX."},{"key":"ref_11","unstructured":"Li, H., Kadav, A., Durdanovic, I., Samet, H., and Graf, H.P. (2016). Pruning filters for efficient convnets. arXiv."},{"key":"ref_12","unstructured":"Frankle, J., and Carbin, M. (2019, January 6\u20139). The lottery ticket hypothesis: Finding sparse, trainable neural networks. Proceedings of the 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA. Available online: https:\/\/openreview.net\/forum?id=rJl-b3RcF7."},{"key":"ref_13","unstructured":"Liu, Z., Sun, M., Zhou, T., Huang, G., and Darrell, T. (2019, January 6\u20139). Re-thinking the value of network pruning. Proceedings of the 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA. Available online: https:\/\/openreview.net\/forum?id=rJlnB3C5Ym."},{"key":"ref_14","unstructured":"Han, S., Pool, J., Tran, J., and Dally, W. (2015, January 7\u201312). Learning both weights and connections for efficient neural network. Proceedings of the NIPS\u201915: Proceedings of the 28th International Conference on Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_15","unstructured":"Gale, T., Elsen, E., and Hooker, S. (2016). The state of sparsity in deep neural networks. arXiv."},{"key":"ref_16","unstructured":"Frankle, J., Dziugaite, G.K., Roy, D.M., and Carbin, M. (2019). The lottery ticket hypothesis at scale. arXiv."},{"key":"ref_17","unstructured":"Cover, T.M., and Thomas, J.A. (2006). Elements of Information Theory, Wiley-Interscience. 
[2nd ed.]."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"387","DOI":"10.1016\/S0304-3975(98)00075-9","article-title":"On Tables of Random Number","volume":"207","author":"Kolmogorov","year":"1998","journal-title":"Theor. Comput. Sci."},{"key":"ref_19","first-page":"1","article-title":"Three Approaches to the Quantitative Definition of Information","volume":"1","author":"Kolmogorov","year":"1965","journal-title":"Probl. Inform. Transm."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"662","DOI":"10.1109\/TIT.1968.1054210","article-title":"Logical basis for information theory and probability theory","volume":"14","author":"Kolmogorov","year":"1968","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1109\/TIT.1972.1054753","article-title":"An algorithm for computing the capacity of arbitrary discrete memoryless channels","volume":"18","author":"Arimoto","year":"1972","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_22","first-page":"19","article-title":"Generalized Kolmogorov complexity and duality in theory of computations","volume":"25","author":"Burgin","year":"1982","journal-title":"Not. Russ. Acad. Sci."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"93","DOI":"10.1016\/j.tcs.2013.07.009","article-title":"Conditional Kolmogorov complexity and universal probability","volume":"501","year":"2013","journal-title":"Theor. Comput. Sci."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Kaltchenko, A. (2004). Algorithms for Estimating Information Distance with Application to Bioinformatics and Linguistics. arXiv.","DOI":"10.1109\/CCECE.2004.1347695"},{"key":"ref_25","unstructured":"Solomonoff, R. (1960). A Preliminary Report on a General Theory of Inductive Inference, Zator Company. Report V-131; Revision Published November 1960."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Rissanen, J. (2007). 
Information and Complexity in Statistical Modeling, Springer.","DOI":"10.1007\/978-0-387-68812-1"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"460","DOI":"10.1109\/TIT.1972.1054855","article-title":"Computation of channel capacity and rate-distortion functions","volume":"18","author":"Blahut","year":"1972","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_28","unstructured":"Vontobel, P.O. (July, January 29). A Generalized Blahut\u2013Arimoto Algorithm. Proceedings of the IEEE International Symposium on Information Theory, Yokohama, Japan."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"204","DOI":"10.1109\/TIT.2012.2214202","article-title":"Extension of the Blahut\u2013Arimoto Algorithm for Maximizing Directed Information","volume":"59","author":"Naiss","year":"2013","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Jetka, T., Nienaltowski, K., Winarski, T., Blonski, S., and Komorowski, M. (2019). Information-theoretic analysis of multivariate single-cell signaling responses. PLoS Comput. Biol., 15.","DOI":"10.1371\/journal.pcbi.1007132"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"3149","DOI":"10.1109\/TIT.2010.2048452","article-title":"Squeezing the Arimoto-Blahut Algorithm for Faster Convergence","volume":"56","author":"Yu","year":"2010","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_32","unstructured":"Krizhevsky, A. (2022, July 24). Learning Multiple Layers of Features from Tiny Images. Available online: https:\/\/www.cs.toronto.edu\/~kriz\/learning-features-2009-TR.pdf."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"374","DOI":"10.1093\/icesjms\/fsx109","article-title":"Automatic fish species classification in underwater videos: Exploiting pretrained deep neural network models to compensate for limited labelled data","volume":"75","author":"Siddiqui","year":"2018","journal-title":"ICES J. Mar. 
Sci."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"379","DOI":"10.1002\/j.1538-7305.1948.tb01338.x","article-title":"A Mathematical Theory of Communication","volume":"27","author":"Shannon","year":"1948","journal-title":"Bell Syst. Tech. J."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"141","DOI":"10.1109\/MSP.2012.2211477","article-title":"The mnist database of handwritten digit images for machine learning research","volume":"29","author":"Deng","year":"2012","journal-title":"IEEE Signal Proc."},{"key":"ref_36","unstructured":"(2022, July 24). Support Vector Machines Speed Pattern Recognition\u2014Vision Systems Design. Available online: https:\/\/www.vision-systems.com\/home\/article\/16737424\/support-vector-machines-speed-pattern-recognition."},{"key":"ref_37","unstructured":"LeCun, Y., Cortez, C., and Burges, C.C.J. (2022, July 24). The MNIST Handwritten Digit Database. Yann LeCun\u2019s Website. Available online: http:\/\/yann.lecun.com."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"971","DOI":"10.1016\/j.imavis.2004.03.008","article-title":"Improved method of handwritten digit recognition tested on MNIST database","volume":"22","author":"Kussul","year":"2004","journal-title":"Image Vis. Comput."},{"key":"ref_39","unstructured":"Belilovsky, E., Eickenberg, M., and Oyallon, E. (2019). Greedy Layerwise Learning Can Scale to ImageNet. arXiv."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"847","DOI":"10.46300\/9106.2020.14.110","article-title":"Artificial Neural Network Performance Boost using Probabilistic Recovery with Fast Cascade Training","volume":"14","author":"Maniatopoulos","year":"2020","journal-title":"Int. J. Circuits Syst. 
Signal Process."}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/13\/9\/405\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T00:16:21Z","timestamp":1760141781000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/13\/9\/405"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,8,27]]},"references-count":40,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2022,9]]}},"alternative-id":["info13090405"],"URL":"https:\/\/doi.org\/10.3390\/info13090405","relation":{},"ISSN":["2078-2489"],"issn-type":[{"type":"electronic","value":"2078-2489"}],"subject":[],"published":{"date-parts":[[2022,8,27]]}}}