{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,18]],"date-time":"2026-01-18T08:30:50Z","timestamp":1768725050927,"version":"3.49.0"},"reference-count":36,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2019,5,1]],"date-time":"2019-05-01T00:00:00Z","timestamp":1556668800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61601288"],"award-info":[{"award-number":["61601288"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61502304"],"award-info":[{"award-number":["61502304"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>Inspired by the pioneering work of the information bottleneck (IB) principle for Deep Neural Networks\u2019 (DNNs) analysis, we thoroughly study the relationship among the model accuracy, I(X;T), and I(T;Y), where I(X;T) and I(T;Y) are the mutual information of the DNN\u2019s output T with input X and label Y. Then, we design an information plane-based framework to evaluate the capability of DNNs (including CNNs) for image classification. Instead of each hidden layer\u2019s output, our framework focuses on the model output T. We successfully apply our framework to many application scenarios arising in deep learning and image classification problems, such as image classification with unbalanced data distribution, model selection, and transfer learning.
The experimental results verify the effectiveness of the information plane-based framework: Our framework may facilitate a quick model selection and determine the number of samples needed for each class in the unbalanced classification problem. Furthermore, the framework explains the efficiency of transfer learning in the deep learning area.<\/jats:p>","DOI":"10.3390\/e21050456","type":"journal-article","created":{"date-parts":[[2019,5,2]],"date-time":"2019-05-02T03:15:22Z","timestamp":1556766922000},"page":"456","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":17,"title":["Utilizing Information Bottleneck to Evaluate the Capability of Deep Neural Networks for Image Classification"],"prefix":"10.3390","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8864-7818","authenticated-orcid":false,"given":"Hao","family":"Cheng","sequence":"first","affiliation":[{"name":"Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China"},{"name":"University of Chinese Academy of Sciences, Beijing 100049, China"},{"name":"School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4947-0316","authenticated-orcid":false,"given":"Dongze","family":"Lian","sequence":"additional","affiliation":[{"name":"School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China"}]},{"given":"Shenghua","family":"Gao","sequence":"additional","affiliation":[{"name":"School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4451-7242","authenticated-orcid":false,"given":"Yanlin","family":"Geng","sequence":"additional","affiliation":[{"name":"State Key Laboratory of ISN, Xidian University, Xi\u2019an 710071, 
China"}]}],"member":"1968","published-online":{"date-parts":[[2019,5,1]]},"reference":[{"key":"ref_1","unstructured":"Krizhevsky, A., Sutskever, I., and Hinton, G.E. (2012, January 3\u20138). ImageNet classification with deep convolutional neural networks. Proceedings of the Advances in Neural Information Processing Systems, Lake Tahoe, NV, USA."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Seide, F., Li, G., and Yu, D. (2011, January 27\u201331). Conversational speech transcription using context-dependent deep neural networks. Proceedings of the Twelfth Annual Conference of the International Speech Communication Association, Florence, Italy.","DOI":"10.21437\/Interspeech.2011-169"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"529","DOI":"10.1038\/nature14236","article-title":"Human-level control through deep reinforcement learning","volume":"518","author":"Mnih","year":"2015","journal-title":"Nature"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"16","DOI":"10.1109\/MIS.2015.69","article-title":"Deep Neural Networks in Machine Translation: an Overview","volume":"30","author":"Zhang","year":"2015","journal-title":"IEEE Intell. Syst."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"484","DOI":"10.1038\/nature16961","article-title":"Mastering the game of Go with deep neural networks and tree search","volume":"529","author":"Silver","year":"2016","journal-title":"Nature"},{"key":"ref_7","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Schroff, F., Kalenichenko, D., and Philbin, J. (2015, January 7\u201312). Facenet: A unified embedding for face recognition and clustering. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298682"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_10","unstructured":"Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., and Torralba, A. (2014). Object detectors emerge in deep scene cnns. arXiv."},{"key":"ref_11","unstructured":"Lu, Y. (2015). Unsupervised learning on neural network outputs: With application in zero-shot learning. arXiv."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Aubry, M., and Russell, B.C. (2015, January 5\u20138). Understanding deep features with computer-generated imagery. Proceedings of the IEEE International Conference on Computer Vision, Tampa, FL, USA.","DOI":"10.1109\/ICCV.2015.329"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Zeiler, M.D., and Fergus, R. (2014, January 6\u201312). Visualizing and understanding convolutional networks. Proceedings of the European Conference on Computer Vision, Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10590-1_53"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Bau, D., Zhou, B., Khosla, A., Oliva, A., and Torralba, A. (2017, January 21\u201326). Network dissection: Quantifying interpretability of deep visual representations. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.354"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Cheng, H., Lian, D., Gao, S., and Geng, Y. (2018, January 8\u201314). Evaluating Capability of Deep Neural Networks for Image Classification via Information Plane. 
Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01252-6_11"},{"key":"ref_16","unstructured":"Vasudevan, S. (2018). Dynamic learning rate using Mutual Information. arXiv."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"537","DOI":"10.1109\/72.298224","article-title":"Using mutual information for selecting features in supervised neural net learning","volume":"5","author":"Battiti","year":"1994","journal-title":"IEEE Trans. Neural Netw."},{"key":"ref_18","unstructured":"Hjelm, R.D., Fedorov, A., Lavoie-Marchildon, S., Grewal, K., Trischler, A., and Bengio, Y. (2018). Learning deep representations by mutual information estimation and maximization. arXiv."},{"key":"ref_19","unstructured":"Tishby, N., Pereira, F.C., and Bialek, W. (2000). The information bottleneck method. arXiv."},{"key":"ref_20","first-page":"165","article-title":"Information bottleneck for Gaussian variables","volume":"6","author":"Chechik","year":"2005","journal-title":"J. Mach. Learn. Res."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1611","DOI":"10.1162\/NECO_a_00961","article-title":"The deterministic information bottleneck","volume":"29","author":"Strouse","year":"2017","journal-title":"Neural Comput."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"2696","DOI":"10.1016\/j.tcs.2010.04.006","article-title":"Learning and generalization with the information bottleneck","volume":"411","author":"Shamir","year":"2010","journal-title":"Theor. Comput. Sci."},{"key":"ref_23","unstructured":"Alemi, A.A., Fischer, I., Dillon, J.V., and Murphy, K. (2016). Deep variational information bottleneck. arXiv."},{"key":"ref_24","unstructured":"Thann, T., and Nguyen, J.C. (2018). Layer-wise Learning of Stochastic Neural Networks with Information Bottleneck. 
arXiv."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"2897","DOI":"10.1109\/TPAMI.2017.2784440","article-title":"Information dropout: Learning optimal representations through noisy computation","volume":"40","author":"Achille","year":"2018","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_26","unstructured":"Shwartz-Ziv, R., and Tishby, N. (2017). Opening the Black Box of Deep Neural Networks via Information. arXiv."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"He, K., Ross, G., and Dollar, P. (2018). Rethinking ImageNet Pre-training. arXiv.","DOI":"10.1109\/ICCV.2019.00502"},{"key":"ref_28","unstructured":"Keskar, N.S., and Socher, R. (2017). Improving Generalization Performance by Switching from Adam to SGD. arXiv."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Raginsky, M., Rakhlin, A., Tsao, M., Wu, Y., and Xu, A. (2016, January 11\u201314). Information-theoretic analysis of stability and bias of learning algorithms. Proceedings of the 2016 IEEE Information Theory Workshop (ITW), Cambridge, UK.","DOI":"10.1109\/ITW.2016.7606789"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Huang, G., Liu, Z., Weinberger, K.Q., and van der Maaten, L. (2016). Densely connected convolutional networks. arXiv.","DOI":"10.1109\/CVPR.2017.243"},{"key":"ref_31","unstructured":"Bishop, C.M. (2006). Pattern Recognition and Machine Learning, Springer."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1016\/S0893-6080(98)00116-6","article-title":"On the momentum term in gradient descent learning algorithms","volume":"12","author":"Qian","year":"1999","journal-title":"Neural Netw."},{"key":"ref_33","first-page":"2121","article-title":"Adaptive subgradient methods for online learning and stochastic optimization","volume":"12","author":"Duchi","year":"2011","journal-title":"J. Mach. Learn. Res."},{"key":"ref_34","unstructured":"Kingma, D.P., and Ba, J. (2014). 
Adam: A method for stochastic optimization. arXiv."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"5629","DOI":"10.1109\/TIT.2018.2807481","article-title":"Demystifying Fixed k-Nearest Neighbor Information Estimators","volume":"64","author":"Gao","year":"2018","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Kolchinsky, A., and Tracey, B. (2017). Estimating mixture entropy with pairwise distances. Entropy, 19.","DOI":"10.3390\/e19070361"}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/21\/5\/456\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T12:48:38Z","timestamp":1760186918000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/21\/5\/456"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,5,1]]},"references-count":36,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2019,5]]}},"alternative-id":["e21050456"],"URL":"https:\/\/doi.org\/10.3390\/e21050456","relation":{},"ISSN":["1099-4300"],"issn-type":[{"value":"1099-4300","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,5,1]]}}}