{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,20]],"date-time":"2026-01-20T03:32:01Z","timestamp":1768879921881,"version":"3.49.0"},"reference-count":21,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2026,1,19]],"date-time":"2026-01-19T00:00:00Z","timestamp":1768780800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100006505","name":"U.S. Army Engineer Research and Development Center","doi-asserted-by":"publisher","award":["W912HZ-24-2-0056"],"award-info":[{"award-number":["W912HZ-24-2-0056"]}],"id":[{"id":"10.13039\/100006505","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>We study how convolutional neural networks reorganize information during learning in natural image classification tasks by tracking mutual information (MI) between inputs, intermediate representations, and labels. Across VGG-16, ResNet-18, and ResNet-50, we find that label-relevant MI grows reliably with depth while input MI depends strongly on architecture and activation, indicating that \u201ccompression\u201d is not a universal phenomenon. Within convolutional layers, label information becomes increasingly concentrated in a small subset of channels; inference-time knockouts, shuffles, and perturbations confirm that these high-MI channels are functionally necessary for accuracy. This behavior suggests a view of representation learning driven by selective concentration and decorrelation rather than global information reduction. 
Finally, we show that a simple dependence-aware regularizer based on the Hilbert\u2013Schmidt Independence Criterion can encourage these same patterns during training, yielding small accuracy gains and consistently faster convergence.<\/jats:p>","DOI":"10.3390\/e28010118","type":"journal-article","created":{"date-parts":[[2026,1,19]],"date-time":"2026-01-19T14:58:54Z","timestamp":1768834734000},"page":"118","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Uncovering Neural Learning Dynamics Through Latent Mutual Information"],"prefix":"10.3390","volume":"28","author":[{"given":"Arianna","family":"Issitt","sequence":"first","affiliation":[{"name":"NEural TransmissionS (NETS) Lab, Florida Institute of Technology, Melbourne, FL 32901, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Alex","family":"Merino","sequence":"additional","affiliation":[{"name":"NEural TransmissionS (NETS) Lab, Florida Institute of Technology, Melbourne, FL 32901, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lamine","family":"Deen","sequence":"additional","affiliation":[{"name":"NEural TransmissionS (NETS) Lab, Florida Institute of Technology, Melbourne, FL 32901, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5524-629X","authenticated-orcid":false,"given":"Ryan T.","family":"White","sequence":"additional","affiliation":[{"name":"NEural TransmissionS (NETS) Lab, Florida Institute of Technology, Melbourne, FL 32901, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0538-3656","authenticated-orcid":false,"given":"Mackenzie J.","family":"Meni","sequence":"additional","affiliation":[{"name":"NEural TransmissionS (NETS) Lab, Florida Institute of Technology, Melbourne, FL 32901, 
USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2026,1,19]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"066138","DOI":"10.1103\/PhysRevE.69.066138","article-title":"Estimating mutual information","volume":"69","author":"Kraskov","year":"2004","journal-title":"Phys. Rev. E"},{"key":"ref_2","unstructured":"Belghazi, M.I., Baratin, A., Rajeswar, S., Ozair, S., Bengio, Y., Courville, A., and Hjelm, R.D. (2021). MINE: Mutual Information Neural Estimation. arXiv."},{"key":"ref_3","unstructured":"van den Oord, A., Li, Y., and Vinyals, O. (2018). Representation Learning with Contrastive Predictive Coding. arXiv."},{"key":"ref_4","unstructured":"Touretzky, D. (1988). An Application of the Principle of Maximum Information Preservation to Linear Systems. Advances in Neural Information Processing Systems, Morgan-Kaufmann."},{"key":"ref_5","unstructured":"Hjelm, R.D., Fedorov, A., Lavoie-Marchildon, S., Grewal, K., Bachman, P., Trischler, A., and Bengio, Y. (2019). Learning deep representations by mutual information estimation and maximization. arXiv."},{"key":"ref_6","unstructured":"Tishby, N., Pereira, F.C., and Bialek, W. (2000). The information bottleneck method. arXiv."},{"key":"ref_7","unstructured":"Shwartz-Ziv, R., and Tishby, N. (2017). Opening the Black Box of Deep Neural Networks via Information. arXiv."},{"key":"ref_8","unstructured":"Saxe, A.M., Bansal, Y., Dapello, J., Advani, M., Kolchinsky, A., Tracey, B.D., and Cox, D.D. (2018, April 30\u2013May 3). On the Information Bottleneck Theory of Deep Learning. Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada."},{"key":"ref_9","unstructured":"Schneider, J., and Prabhushankar, M. (2023). Understanding and Leveraging the Learning Phases of Neural Networks. 
arXiv."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"121239","DOI":"10.1016\/j.ins.2024.121239","article-title":"Entropy-based guidance of deep neural networks for accelerated convergence and improved performance","volume":"681","author":"Meni","year":"2024","journal-title":"Inf. Sci."},{"key":"ref_11","unstructured":"Meni, M. (2024). Decoding Neural Networks: An Information-Theoretic Guide to Interpretability, Error Analysis and Efficiency. [Ph.D. Dissertation, Florida Institute of Technology]."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"298","DOI":"10.1016\/j.ins.2021.04.066","article-title":"Stochastic mutual information gradient estimation for dimensionality reduction networks","volume":"570","year":"2021","journal-title":"Inf. Sci."},{"key":"ref_13","unstructured":"Simonyan, K., and Zisserman, A. (2015, January 7\u20139). Very Deep Convolutional Networks for Large-Scale Image Recognition. Proceedings of the International Conference on Learning Representations (ICLR), San Diego, CA, USA."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 21\u201326). Deep Residual Learning for Image Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_15","unstructured":"Howard, J. (2025, October 17). Imagenette. Available online: https:\/\/github.com\/fastai\/imagenette."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Gretton, A., Bousquet, O., Smola, A., and Sch\u00f6lkopf, B. (2005). Measuring Statistical Dependence with Hilbert-Schmidt Norms. Algorithmic Learning Theory (ALT), Springer.","DOI":"10.1007\/11564089_7"},{"key":"ref_17","unstructured":"Gretton, A., Fukumizu, K., Teo, C.H., Song, L., Sch\u00f6lkopf, B., and Smola, A.J. (2008, January 8\u201311). A Kernel Statistical Test of Independence. 
Proceedings of the Advances in Neural Information Processing Systems (NeurIPS), Vancouver, BC, Canada."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Meni, M., Mahendrakar, T., Raney, O.D., White, R.T., Mayo, M.L., and Pilkiewicz, K.R. (2024). Taking a PEEK into YOLOv5 for Satellite Component Recognition via Entropy-based Visual Explanations. AIAA SCITECH 2024 Forum, American Institute of Aeronautics and Astronautics.","DOI":"10.2514\/6.2024-2766"},{"key":"ref_19","first-page":"296","article-title":"Probabilistic Explanations for Entropic Knowledge Extraction for Automated Satellite Component Detection","volume":"22","author":"Meni","year":"2025","journal-title":"J. Aerosp. Inf. Syst."},{"key":"ref_20","unstructured":"Meni, M.J., Gisclair, B., White, R.T., and Mahendrakar, T. (2025, January 10\u201313). PEEK-Guided Neural Network Pruning for Deployment on Low SWaP Hardware. Proceedings of the 39th Annual Small Satellite Conference (SmallSat 2025), Utah State University, Logan, UT, USA."},{"key":"ref_21","unstructured":"Meni, M.J., Gisclair, B., Niwas, M., and White, R.T. (2026). 
PEEK Variance: An Information-Theoretic Metric Unifying Interpretability, Optimization, and Efficiency in Deep Neural Networks, in peer review."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/28\/1\/118\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,1,19]],"date-time":"2026-01-19T15:02:25Z","timestamp":1768834945000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/28\/1\/118"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,1,19]]},"references-count":21,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2026,1]]}},"alternative-id":["e28010118"],"URL":"https:\/\/doi.org\/10.3390\/e28010118","relation":{},"ISSN":["1099-4300"],"issn-type":[{"value":"1099-4300","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,1,19]]}}}