{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,25]],"date-time":"2026-01-25T00:18:07Z","timestamp":1769300287435,"version":"3.49.0"},"reference-count":38,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2020,3,14]],"date-time":"2020-03-14T00:00:00Z","timestamp":1584144000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>The increasing size of modern datasets combined with the difficulty of obtaining real label information (e.g., class) has made semi-supervised learning a problem of considerable practical importance in modern data analysis. Semi-supervised learning is supervised learning with additional information on the distribution of the examples or, simultaneously, an extension of unsupervised learning guided by some constraints. In this article we present a methodology that bridges between artificial neural network output vectors and logical constraints. In order to do this, we present a semantic loss function and a generalized entropy loss function (R\u00e9nyi entropy) that capture how close the neural network is to satisfying the constraints on its output. Our methods are intended to be generally applicable and compatible with any feedforward neural network. Therefore, the semantic loss and generalized entropy loss are simply a regularization term that can be directly plugged into an existing loss function. We evaluate our methodology over an artificially simulated dataset and two commonly used benchmark datasets which are MNIST and Fashion-MNIST to assess the relation between the analyzed loss functions and the influence of the various input and tuning parameters on the classification accuracy. The experimental evaluation shows that both losses effectively guide the learner to achieve (near-) state-of-the-art results on semi-supervised multiclass classification.<\/jats:p>","DOI":"10.3390\/e22030334","type":"journal-article","created":{"date-parts":[[2020,3,17]],"date-time":"2020-03-17T09:27:41Z","timestamp":1584437261000},"page":"334","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":25,"title":["Semantic and Generalized Entropy Loss Functions for Semi-Supervised Deep Learning"],"prefix":"10.3390","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6953-8907","authenticated-orcid":false,"given":"Krzysztof","family":"Gajowniczek","sequence":"first","affiliation":[{"name":"Department of Artificial Intelligence, Institute of Information Technology, Warsaw University of Life Sciences-SGGW, 02-776 Warsaw, Poland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yitao","family":"Liang","sequence":"additional","affiliation":[{"name":"Computer Science Department, University of California, Los Angeles, CA 90095, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tal","family":"Friedman","sequence":"additional","affiliation":[{"name":"Computer Science Department, University of California, Los Angeles, CA 90095, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tomasz","family":"Z\u0105bkowski","sequence":"additional","affiliation":[{"name":"Department of Artificial Intelligence, Institute of Information Technology, Warsaw University of Life Sciences-SGGW, 02-776 Warsaw, Poland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3434-2503","authenticated-orcid":false,"given":"Guy","family":"Van den Broeck","sequence":"additional","affiliation":[{"name":"Computer Science Department, University of California, Los Angeles, CA 90095, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2020,3,14]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Gajowniczek, K., Or\u0142owski, A., and Z\u0105bkowski, T. (2018). Simulation Study on the Application of the Generalized Entropy Concept in Artificial Neural Networks. Entropy, 20.","DOI":"10.3390\/e20040249"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Gajowniczek, K., Z\u0105bkowski, T., and Sodenkamp, M. (2018). Revealing Household Characteristics from Electricity Meter Data with Grade Analysis and Machine Learning Algorithms. Appl. Sci., 8.","DOI":"10.3390\/app8091654"},{"key":"ref_3","unstructured":"Sadarangani, A., and Jivani, A. (2016). A survey of semi-Supervised learning. Int. J. Eng. Sci. Res. Technol., 5."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Nafkha, R., Gajowniczek, K., and Z\u0105bkowski, T. (2018). Do Customers Choose Proper Tariff? Empirical Analysis Based on Polish Data Using Unsupervised Techniques. Energies, 11.","DOI":"10.3390\/en11030514"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Prakash, V.J., and Nithya, D.L. (2014). A survey on semi-supervised learning techniques. arXiv.","DOI":"10.14445\/22312803\/IJCTT-V8P105"},{"key":"ref_6","unstructured":"Xu, J., Zhang, Z., Friedman, T., Liang, Y., and Van den Broeck, G. (2018, January 10\u201315). A semantic loss function for deep learning with symbolic knowledge. Proceedings of the 35th International Conference on Machine Learning (ICML), Stockholm, Sweden."},{"key":"ref_7","unstructured":"Xu, J., Zhang, Z., Friedman, T., Liang, Y., and Van den Broeck, G. (2017, January 4\u20139). A Semantic Loss Function for Deep Learning Under Weak Supervision. Proceedings of the NIPS 2017 Workshop on Learning with Limited Labeled Data: Weak Supervision and Beyond, Long Beach, CA, USA."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1016\/j.patrec.2018.02.010","article-title":"Deep learning for sensor-based activity recognition: A survey","volume":"119","author":"Wang","year":"2019","journal-title":"Pattern Recognit. Lett."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1979","DOI":"10.1109\/TPAMI.2018.2858821","article-title":"Virtual Adversarial Training: A Regularization Method for Supervised and Semi-Supervised Learning","volume":"41","author":"Miyato","year":"2019","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_11","first-page":"1","article-title":"Binarsity: A penalization for one-Hot encoded features in linear supervised learning","volume":"20","author":"Alaya","year":"2019","journal-title":"J. Mach. Learn. Res."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1145\/3065386","article-title":"ImageNet classification with deep convolutional neural networks","volume":"60","author":"Krizhevsky","year":"2017","journal-title":"Commun. ACM"},{"key":"ref_13","unstructured":"Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning, MIT Press."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"379","DOI":"10.1002\/j.1538-7305.1948.tb01338.x","article-title":"A Mathematical Theory of Communication","volume":"27","author":"Shannon","year":"1948","journal-title":"Bell Syst. Tech. J."},{"key":"ref_15","unstructured":"R\u00e9nyi, A. (1961). On measures of information and entropy. the fourth Berkeley Symposium on Mathematics, Statistics and Probability, University of California Press."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Amig\u00f3, J., Balogh, S., and Hern\u00e1ndez, S. (2018). A Brief Review of Generalized Entropies. Entropy, 20.","DOI":"10.3390\/e20110813"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"A:38","DOI":"10.12693\/APhysPolA.127.A-38","article-title":"Q-Entropy Approach to Selecting High Income Households","volume":"127","author":"Gajowniczek","year":"2015","journal-title":"Acta Phys. Pol. A"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"971","DOI":"10.12693\/APhysPolA.129.971","article-title":"Entropy Based Trees to Support Decision Making for Customer Churn Management","volume":"129","author":"Gajowniczek","year":"2016","journal-title":"Acta Phys. Pol. A"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"2278","DOI":"10.1109\/5.726791","article-title":"Gradient-Based learning applied to document recognition","volume":"86","author":"Lecun","year":"1998","journal-title":"Proc. IEEE"},{"key":"ref_20","unstructured":"Xiao, H., Rasul, K., and Vollgraf, R. (2017). Fashion-Mnist: A novel image dataset for benchmarking machine learning algorithms. arXiv."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1016\/j.patrec.2013.10.017","article-title":"Pattern classification and clustering: A review of partially supervised learning approaches","volume":"37","author":"Schwenker","year":"2014","journal-title":"Pattern Recognit. Lett."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"van Engelen, J.E., and Hoos, H.H. (2019). A survey on semi-Supervised learning. Mach. Learn., 1\u201368.","DOI":"10.1007\/s10994-019-05855-6"},{"key":"ref_23","unstructured":"Bengio, Y., Lee, D.H., Bornschein, J., Mesnard, T., and Lin, Z. (2015). Towards biologically plausible deep learning. arXiv."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"94","DOI":"10.3389\/fncom.2016.00094","article-title":"Toward an Integration of Deep Learning and Neuroscience","volume":"10","author":"Marblestone","year":"2016","journal-title":"Front. Comput. Neurosci."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Najafabadi, M.M., Villanustre, F., Khoshgoftaar, T.M., Seliya, N., Wald, R., and Muharemagic, E. (2015). Deep learning applications and challenges in big data analytics. J. Big Data, 2.","DOI":"10.1186\/s40537-014-0007-7"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1016\/j.neucom.2016.12.038","article-title":"A survey of deep neural network architectures and their applications","volume":"234","author":"Liu","year":"2017","journal-title":"Neurocomputing"},{"key":"ref_27","unstructured":"Nair, V., and Hinton, G.E. (2010, January 21\u201325). Rectified linear units improve restricted boltzmann machines. Proceedings of the 27th international conference on machine learning (ICML), Haifa, Israel."},{"key":"ref_28","unstructured":"Kingma, D.P., and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv."},{"key":"ref_29","unstructured":"Ioffe, S., and Szegedy, C. (2015, January 6\u201311). Batch normalization: Accelerating deep network training by reducing internal covariate shift. Proceedings of the 32nd International Conference on Machine Learning (ICML), Lille, France."},{"key":"ref_30","unstructured":"Darwiche, A. (2011, January 16\u201322). SDD: A new canonical representation of propositional knowledge bases. Proceedings of the 22nd International Joint Conference on Artificial Intelligence (IJCAI), Barcelona, Spain."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"082206","DOI":"10.1063\/1.4892761","article-title":"Relating different quantum generalizations of the conditional R\u00e9nyi entropy","volume":"55","author":"Tomamichel","year":"2014","journal-title":"J. Math. Phys."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"6801","DOI":"10.1109\/TIT.2014.2357799","article-title":"On the Conditional R\u00e9nyi Entropy","volume":"60","author":"Fehr","year":"2014","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_33","unstructured":"The R Development Core Team (2014). R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing."},{"key":"ref_34","unstructured":"Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., and Ghemawat, S. (2016). Tensorflow: Large-Scale machine learning on heterogeneous distributed systems. arXiv."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Rasmus, A., Berglund, M., Honkala, M., Valpola, H., and Raiko, T. (2015, January 7\u201312). Semi-Supervised learning with ladder networks. Proceedings of the Neural Information Processing Systems 2015 (NIPS 2015), Montreal, QC, Canada.","DOI":"10.1016\/j.neunet.2014.09.004"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Pitelis, N., Russell, C., and Agapito, L. (2014). Semi-Supervised learning using an unsupervised atlas. Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Springer.","DOI":"10.1007\/978-3-662-44851-9_36"},{"key":"ref_37","unstructured":"Kingma, D.P., Mohamed, S., Rezende, D.J., and Welling, M. (2014, January 8\u201311). Semi-Supervised learning with deep generative models. Proceedings of the Neural Information Processing Systems 2014 (NIPS 2014), Montreal, QC, Canada."},{"key":"ref_38","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (July, January 26). Deep residual learning for image recognition. Proceedings of the IEEE conference on computer vision and pattern recognition, Las Vegas, NV, USA."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/22\/3\/334\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T09:07:00Z","timestamp":1760173620000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/22\/3\/334"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,3,14]]},"references-count":38,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2020,3]]}},"alternative-id":["e22030334"],"URL":"https:\/\/doi.org\/10.3390\/e22030334","relation":{},"ISSN":["1099-4300"],"issn-type":[{"value":"1099-4300","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,3,14]]}}}