{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,26]],"date-time":"2025-11-26T05:08:05Z","timestamp":1764133685274,"version":"build-2065373602"},"reference-count":38,"publisher":"MDPI AG","issue":"12","license":[{"start":{"date-parts":[[2023,11,30]],"date-time":"2023-11-30T00:00:00Z","timestamp":1701302400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Key Research and Development Program of China","award":["2022YFB2701400","62272124","62361010","[2020]61","[2019]56","GZUAMT2021KF[01]"],"award-info":[{"award-number":["2022YFB2701400","62272124","62361010","[2020]61","[2019]56","GZUAMT2021KF[01]"]}]},{"name":"National Natural Science Foundation of China","award":["2022YFB2701400","62272124","62361010","[2020]61","[2019]56","GZUAMT2021KF[01]"],"award-info":[{"award-number":["2022YFB2701400","62272124","62361010","[2020]61","[2019]56","GZUAMT2021KF[01]"]}]},{"name":"Research Project of Guizhou University for Talent Introduction","award":["2022YFB2701400","62272124","62361010","[2020]61","[2019]56","GZUAMT2021KF[01]"],"award-info":[{"award-number":["2022YFB2701400","62272124","62361010","[2020]61","[2019]56","GZUAMT2021KF[01]"]}]},{"name":"Cultivation Project of Guizhou University","award":["2022YFB2701400","62272124","62361010","[2020]61","[2019]56","GZUAMT2021KF[01]"],"award-info":[{"award-number":["2022YFB2701400","62272124","62361010","[2020]61","[2019]56","GZUAMT2021KF[01]"]}]},{"name":"Open Fund of Key Laboratory of Advanced Manufacturing Technology, Ministry of Education","award":["2022YFB2701400","62272124","62361010","[2020]61","[2019]56","GZUAMT2021KF[01]"],"award-info":[{"award-number":["2022YFB2701400","62272124","62361010","[2020]61","[2019]56","GZUAMT2021KF[01]"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>Deep learning is one of the 
most exciting and promising techniques in the field of artificial intelligence (AI), driving AI applications to become more intelligent and comprehensive. However, existing deep learning techniques usually require a large amount of expensive labeled data, which limits their application and development, and thus it is imperative to study unsupervised machine learning. The method of learning deep representations by mutual information estimation and maximization (Deep InfoMax or DIM) has achieved unprecedented results in the field of unsupervised learning. However, to restrict the encoder to learn more normalized feature representations, the DIM method uses an adversarial network learning method to make the encoder output consistent with a priori positively distributed data. Model training with the adversarial network learning method is known to be difficult to converge: the cross-entropy loss function contains a logarithmic function, so the gradients of the model parameters are susceptible to the \u201cgradient explosion\u201d or \u201cgradient vanishing\u201d phenomena, which makes the training of the DIM method extremely unstable. To address this, we propose a Wasserstein distance-based DIM method, called WDIM, to solve the stability problem of model training. The training stability of the WDIM method and its unsupervised classification ability are then verified on the CIFAR10, CIFAR100, and STL10 datasets. The experiments show that the proposed WDIM method is more stable under parameter updates, converges faster, and achieves almost the same accuracy as the DIM method on the unsupervised classification task. 
Finally, we also reflect on future research for the WDIM method, aiming to provide research ideas and directions for solving image classification tasks with unsupervised learning.<\/jats:p>","DOI":"10.3390\/e25121607","type":"journal-article","created":{"date-parts":[[2023,11,30]],"date-time":"2023-11-30T07:44:42Z","timestamp":1701330282000},"page":"1607","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Stable and Fast Deep Mutual Information Maximization Based on Wasserstein Distance"],"prefix":"10.3390","volume":"25","author":[{"ORCID":"https:\/\/orcid.org\/0009-0004-9650-0786","authenticated-orcid":false,"given":"Xing","family":"He","sequence":"first","affiliation":[{"name":"State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang 550025, China"},{"name":"Guizhou Key Laboratory of Pattern Recognition and Intelligent System, Guizhou Minzu University, Guiyang 550025, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8733-4596","authenticated-orcid":false,"given":"Changgen","family":"Peng","sequence":"additional","affiliation":[{"name":"Guizhou Big Data Academy, Guizhou University, Guiyang 550025, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lin","family":"Wang","sequence":"additional","affiliation":[{"name":"Guizhou Key Laboratory of Pattern Recognition and Intelligent System, Guizhou Minzu University, Guiyang 550025, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6590-5757","authenticated-orcid":false,"given":"Weijie","family":"Tan","sequence":"additional","affiliation":[{"name":"Guizhou Big Data Academy, Guizhou University, Guiyang 550025, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zifan","family":"Wang","sequence":"additional","affiliation":[{"name":"Institute of Guizhou Aerospace Measuring and Testing Technology, Guiyang 550025, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2023,11,30]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"299","DOI":"10.1109\/NNSP.2000.889421","article-title":"A fast, on-line algorithm for PCA and its convergence characteristics","volume":"Volume 1","author":"Rao","year":"2000","journal-title":"Neural Networks for Signal Processing X, Proceedings of the 2000 IEEE Signal Processing Society Workshop (Cat. No. 00TH8501), Sydney, NSW, Australia, 11\u201313 December 2000"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"2319","DOI":"10.1126\/science.290.5500.2319","article-title":"A global geometric framework for nonlinear dimensionality reduction","volume":"290","author":"Tenenbaum","year":"2000","journal-title":"Science"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"2323","DOI":"10.1126\/science.290.5500.2323","article-title":"Nonlinear dimensionality reduction by locally linear embedding","volume":"290","author":"Roweis","year":"2000","journal-title":"Science"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1373","DOI":"10.1162\/089976603321780317","article-title":"Laplacian eigenmaps for dimensionality reduction and data representation","volume":"15","author":"Belkin","year":"2003","journal-title":"Neural Comput."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"5591","DOI":"10.1073\/pnas.1031596100","article-title":"Hessian eigenmaps: Locally linear embedding techniques for high-dimensional data","volume":"100","author":"Donoho","year":"2003","journal-title":"Proc. Natl. Acad. Sci. 
USA"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"313","DOI":"10.1137\/S1064827502419154","article-title":"Principal manifolds and nonlinear dimensionality reduction via tangent space alignment","volume":"26","author":"Zhang","year":"2004","journal-title":"SIAM J. Sci. Comput."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1145\/3422622","article-title":"Generative adversarial networks","volume":"63","author":"Goodfellow","year":"2020","journal-title":"Commun. ACM"},{"key":"ref_8","unstructured":"Salimans, T., Goodfellow, I., Zaremba, W., Cheung, V., Radford, A., and Chen, X. (2016). Improved techniques for training gans. arXiv."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"41:1","DOI":"10.1145\/3404890","article-title":"Blockchain-enabled Tensor-based Conditional Deep Convolutional GAN for Cyber-physical-Social Systems","volume":"21","author":"Feng","year":"2021","journal-title":"ACM Trans. Internet Technol."},{"key":"ref_10","unstructured":"Hjelm, R.D., Fedorov, A., Lavoie-Marchildon, S., Grewal, K., Bachman, P., Trischler, A., and Bengio, Y. (2018). Learning deep representations by mutual information estimation and maximization. arXiv."},{"key":"ref_11","unstructured":"Belghazi, M.I., Baratin, A., Rajeshwar, S., Ozair, S., Bengio, Y., Courville, A., and Hjelm, D. (2018, January 10\u201315). Mutual information neural estimation. Proceedings of the International Conference on Machine Learning, PMLR, Stockholm, Sweden."},{"key":"ref_12","first-page":"9912","article-title":"Unsupervised learning of visual features by contrasting cluster assignments","volume":"33","author":"Caron","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_13","unstructured":"Poole, B., Ozair, S., Van Den Oord, A., Alemi, A., and Tucker, G. (2019, January 9\u201315). On variational bounds of mutual information. 
Proceedings of the International Conference on Machine Learning, PMLR, Long Beach, CA, USA."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"He, K., Fan, H., Wu, Y., Xie, S., and Girshick, R. (2020, January 13\u201319). Momentum contrast for unsupervised visual representation learning. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00975"},{"key":"ref_15","unstructured":"Chen, T., Kornblith, S., Norouzi, M., and Hinton, G. (2020, January 13\u201318). A simple framework for contrastive learning of visual representations. Proceedings of the International Conference on Machine Learning, PMLR, Virtual."},{"key":"ref_16","unstructured":"Chen, T., Kornblith, S., Swersky, K., Norouzi, M., and Hinton, G.E. (2020, January 6\u201312). Big Self-Supervised Models are Strong Semi-Supervised Learners. Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, Virtual."},{"key":"ref_17","unstructured":"Han, T., Xie, W., and Zisserman, A. (2020, January 6\u201312). Self-supervised Co-Training for Video Representation Learning. Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, Virtual."},{"key":"ref_18","unstructured":"van den Oord, A., Li, Y., and Vinyals, O. (2018). Representation Learning with Contrastive Predictive Coding. arXiv."},{"key":"ref_19","unstructured":"Higgins, I., Matthey, L., Pal, A., Burgess, C.P., Glorot, X., Botvinick, M.M., Mohamed, S., and Lerchner, A. (2017, January 24\u201326). beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Proceedings of the 5th International Conference on Learning Representations, ICLR 2017, Toulon, France."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Zhang, R., Isola, P., and Efros, A.A. 
(2017, January 21\u201326). Split-Brain Autoencoders: Unsupervised Learning by Cross-Channel Prediction. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.76"},{"key":"ref_21","unstructured":"Bachman, P., Hjelm, R.D., and Buchwalter, W. (2019, January 8\u201314). Learning Representations by Maximizing Mutual Information Across Views. Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, Vancouver, BC, Canada."},{"key":"ref_22","unstructured":"Asano, Y.M., Rupprecht, C., and Vedaldi, A. (2019). Self-labelling via simultaneous clustering and representation learning. arXiv."},{"key":"ref_23","unstructured":"Tschannen, M., Djolonga, J., Rubenstein, P.K., Gelly, S., and Lucic, M. (2019). On mutual information maximization for representation learning. arXiv."},{"key":"ref_24","unstructured":"Chen, X., Fan, H., Girshick, R.B., and He, K. (2020). Improved Baselines with Momentum Contrastive Learning. arXiv."},{"key":"ref_25","first-page":"139","article-title":"Deep Clustering for Unsupervised Learning of Visual Features","volume":"Volume 11218","author":"Ferrari","year":"2018","journal-title":"Lecture Notes in Computer Science, Proceedings of the Computer Vision\u2014ECCV 2018\u201415th European Conference, Munich, Germany, 8\u201314 September 2018"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Zhuang, C., Zhai, A.L., and Yamins, D. (November, January 27). Local Aggregation for Unsupervised Learning of Visual Embeddings. Proceedings of the 2019 IEEE\/CVF International Conference on Computer Vision, ICCV 2019, Seoul, Republic of Korea.","DOI":"10.1109\/ICCV.2019.00610"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Shen, Z., Liu, Z., Liu, Z., Savvides, M., Darrell, T., and Xing, E.P. (March, January 22). 
Un-mix: Rethinking Image Mixtures for Unsupervised Visual Representation Learning. Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, AAAI 2022, Thirty-Fourth Conference on Innovative Applications of Artificial Intelligence, IAAI 2022, The Twelfth Symposium on Educational Advances in Artificial Intelligence, EAAI 2022, Virtual Event.","DOI":"10.1609\/aaai.v36i2.20119"},{"key":"ref_28","unstructured":"Gidaris, S., Singh, P., and Komodakis, N. (May, January 30). Unsupervised Representation Learning by Predicting Image Rotations. Proceedings of the 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Wu, Z., Xiong, Y., Yu, S.X., and Lin, D. (2018, January 18\u201322). Unsupervised Feature Learning via Non-Parametric Instance Discrimination. Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00393"},{"key":"ref_30","unstructured":"Alwassel, H., Mahajan, D., Korbar, B., Torresani, L., Ghanem, B., and Tran, D. (2020, January 6\u201312). Self-Supervised Learning by Cross-Modal Audio-Video Clustering. Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, Virtual."},{"key":"ref_31","unstructured":"Zbontar, J., Jing, L., Misra, I., LeCun, Y., and Deny, S. (2021, January 18\u201324). Barlow Twins: Self-Supervised Learning via Redundancy Reduction. Proceedings of the 38th International Conference on Machine Learning, ICML 2021, Virtual Event."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Feng, J., Yang, L.T., Ren, B., Zou, D., Dong, M., and Zhang, S. (2023). Tensor Recurrent Neural Network with Differential Privacy. IEEE Trans. 
Comput., 1\u201311.","DOI":"10.1109\/TC.2023.3236868"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"105490","DOI":"10.1016\/j.knosys.2020.105490","article-title":"Zero-shot learning by mutual information estimation and maximization","volume":"194","author":"Tang","year":"2020","journal-title":"Knowl. Based Syst."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"5510329","DOI":"10.1155\/2023\/5510329","article-title":"Fast and Accurate Deep Leakage from Gradients Based on Wasserstein Distance","volume":"2023","author":"He","year":"2023","journal-title":"Int. J. Intell. Syst."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1002\/cpa.3160280102","article-title":"Asymptotic evaluation of certain markov process expectations for large time, I","volume":"28","author":"Donsker","year":"1975","journal-title":"Commun. Pure Appl. Math."},{"key":"ref_36","unstructured":"Kingma, D.P., and Welling, M. (2014, January 14\u201316). Auto-Encoding Variational Bayes. Proceedings of the 2nd International Conference on Learning Representations, ICLR 2014, Banff, AB, Canada. Conference Track Proceedings."},{"key":"ref_37","unstructured":"Arjovsky, M., Chintala, S., and Bottou, L. (2017, January 6\u201311). Wasserstein Generative Adversarial Networks. Proceedings of the 34th International Conference on Machine Learning, ICML 2017, Sydney, NSW, Australia."},{"key":"ref_38","unstructured":"Cuturi, M. (2013, January 5\u20138). Sinkhorn Distances: Lightspeed Computation of Optimal Transport. 
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013, Lake Tahoe, NV, USA."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/25\/12\/1607\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T21:34:41Z","timestamp":1760132081000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/25\/12\/1607"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,11,30]]},"references-count":38,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["e25121607"],"URL":"https:\/\/doi.org\/10.3390\/e25121607","relation":{},"ISSN":["1099-4300"],"issn-type":[{"type":"electronic","value":"1099-4300"}],"subject":[],"published":{"date-parts":[[2023,11,30]]}}}