{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,11]],"date-time":"2026-06-11T22:40:20Z","timestamp":1781217620395,"version":"3.54.1"},"reference-count":23,"publisher":"MDPI AG","issue":"12","license":[{"start":{"date-parts":[[2021,12,9]],"date-time":"2021-12-09T00:00:00Z","timestamp":1639008000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>In neural networks, a vital component in the learning and inference process is the activation function. There are many different approaches, but only nonlinear activation functions allow such networks to compute non-trivial problems by using only a small number of nodes, and such activation functions are called nonlinearities. With the emergence of deep learning, the need for competent activation functions that can enable or expedite learning in deeper layers has emerged. In this paper, we propose a novel activation function, combining many features of successful activation functions, achieving 2.53% higher accuracy than the industry standard ReLU in a variety of test cases.<\/jats:p>","DOI":"10.3390\/info12120513","type":"journal-article","created":{"date-parts":[[2021,12,9]],"date-time":"2021-12-09T21:46:58Z","timestamp":1639086418000},"page":"513","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":69,"title":["Learnable Leaky ReLU (LeLeLU): An Alternative Accuracy-Optimized Activation Function"],"prefix":"10.3390","volume":"12","author":[{"given":"Andreas","family":"Maniatopoulos","sequence":"first","affiliation":[{"name":"Department of Electrical and Computer Engineering, Democritus University of Greece, 67100 Xanthi, Greece"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0898-6102","authenticated-orcid":false,"given":"Nikolaos","family":"Mitianoudis","sequence":"additional","affiliation":[{"name":"Department of Electrical and Computer Engineering, Democritus University of Greece, 67100 Xanthi, Greece"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2021,12,9]]},"reference":[{"key":"ref_1","unstructured":"Nair, V., and Hinton, G.E. (2010, January 21\u201324). Rectified Linear Units Improve Restricted Boltzmann Machines. Proceedings of the 27th International Conference on International Conference on Machine Learning, 2010 ICML\u201910, Haifa, Israel."},{"key":"ref_2","unstructured":"Maas, A.L., Hannun, A.Y., and Ng, A.Y. (2013, January 16\u201321). Rectifier nonlinearities improve neural network acoustic models. Proceedings of the 30th International Conference on Machine Learning, Atlanta, GA, USA."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2015). Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. arXiv.","DOI":"10.1109\/ICCV.2015.123"},{"key":"ref_4","unstructured":"Hendrycks, D., and Gimpel, K. (2016). Gaussian Error Linear Units (GELUs). arXiv."},{"key":"ref_5","unstructured":"Ioffe, S., and Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv."},{"key":"ref_6","first-page":"1929","article-title":"Dropout: A simple way to prevent neural networks from overfitting","volume":"15","author":"Srivastava","year":"2014","journal-title":"J. Mach. Learn. Res."},{"key":"ref_7","unstructured":"Dugas, C., Bengio, Y., B\u00e9lisle, F., Nadeau, C., and Garcia, R. (2001). Incorporating second-order functional knowledge for better option pricing. Advances in Neural Information Processing Systems, The MIT Press."},{"key":"ref_8","unstructured":"Glorot, X., Bordes, A., and Bengio, Y. (2011, January 11\u201313). Deep sparse rectifier neural networks. Proceedings of the International Conference on Artificial Intelligence and Statistics, Fort Lauderdale, FL, USA."},{"key":"ref_9","unstructured":"Clevert, D.-A., Unterthiner, T., and Hochreiter, S. (2015). Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs). arXiv."},{"key":"ref_10","unstructured":"Klambauer, G., Unterthiner, T., Mayr, A., and Hochreiter, S. (2017, January 4\u20139). Self-normalizing neural networks. Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_11","unstructured":"Courbariaux, M., Bengio, Y., and David, J.-P. (2015, January 7\u201312). BinaryConnect: Training deep neural networks with binary weights during propagations. Proceedings of the NIPS\u201915: Proceedings of the 28th International Conference on Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_12","unstructured":"Ramachandran, P., Zoph, B., and Le, Q.V. (2017). Swish: A Self-Gated Activation Function. arXiv."},{"key":"ref_13","unstructured":"Misra, D.M. (2019). A self regularized non-monotonic neural activation function. arXiv."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"114048","DOI":"10.1016\/j.eswa.2020.114048","article-title":"Ensemble of convolutional neural networks trained with different activation functions","volume":"166","author":"Maguolo","year":"2021","journal-title":"Expert Syst. Appl."},{"key":"ref_15","unstructured":"Liang, S., Lyu, L., Wang, C., and Yang, H. (2021). Reproducing Activation Function for Deep Learning. arXiv."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Zhou, Y., Zhu, Z., and Zhong, Z. (2021). Learning specialized activation functions with the Piecewise Linear Unit. arXiv.","DOI":"10.1109\/ICCV48922.2021.01188"},{"key":"ref_17","unstructured":"Shridhar, K., Lee, J., Hayashi, H., Mehta, P., Iwana, B.K., Kang, S., Uchida, S., Ahmed, S., and Dengel, A. (2020). ProbAct: A Probabilistic Activation Function for Deep Neural Networks. arXiv."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Bingham, G., and Miikkulainen, R. (2021). Discovering Parametric Activation Functions. arXiv.","DOI":"10.1016\/j.neunet.2022.01.001"},{"key":"ref_19","first-page":"19","article-title":"Generalized Kolmogorov complexity and duality in theory of computations","volume":"25","author":"Burgin","year":"1982","journal-title":"Not. Russ. Acad. Sci."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Kaltchenko, A. (2004). Algorithms for Estimating Information Distance with Application to Bioinformatics and Linguistics. arXiv.","DOI":"10.1109\/CCECE.2004.1347695"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"93","DOI":"10.1016\/j.tcs.2013.07.009","article-title":"Conditional Kolmogorov complexity and universal probability","volume":"501","year":"2013","journal-title":"Theor. Comput. Sci."},{"key":"ref_22","unstructured":"Solomonoff, R. (1960). A Preliminary Report on a General Theory of Inductive Inference, Office of Scientific Research, United States Air Force. Report V-131."},{"key":"ref_23","unstructured":"Jorma, R. (2007). Information and Complexity in Statistical Modeling, Springer."}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/12\/12\/513\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T07:43:57Z","timestamp":1760168637000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/12\/12\/513"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,12,9]]},"references-count":23,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2021,12]]}},"alternative-id":["info12120513"],"URL":"https:\/\/doi.org\/10.3390\/info12120513","relation":{},"ISSN":["2078-2489"],"issn-type":[{"value":"2078-2489","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,12,9]]}}}