{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T04:04:34Z","timestamp":1760241874123,"version":"build-2065373602"},"reference-count":43,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2018,10,26]],"date-time":"2018-10-26T00:00:00Z","timestamp":1540512000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>Deep Learning (DL) networks are recent revolutionary developments in artificial intelligence research. Typical networks are stacked by groups of layers that are further composed of many convolutional kernels or neurons. In network design, many hyper-parameters need to be defined heuristically before training in order to achieve high cross-validation accuracies. However, accuracy evaluation from the output layer alone is not sufficient to specify the roles of the hidden units in associated networks. This results in a significant knowledge gap between DL\u2019s wider applications and its limited theoretical understanding. To narrow the knowledge gap, our study explores visualization techniques to illustrate the mutual information (MI) in DL networks. The MI is a theoretical measurement, reflecting the relationship between two sets of random variables even if their relationship is highly non-linear and hidden in high-dimensional data. Our study aims to understand the roles of DL units in classification performance of the networks. Via a series of experiments using several popular DL networks, it shows that the visualization of MI and its change patterns between the input\/output with the hidden layers and basic units can facilitate a better understanding of these DL units\u2019 roles. Our investigation on network convergence suggests a more objective manner to potentially evaluate DL networks. Furthermore, the visualization provides a useful tool to gain insights into the network performance, and thus to potentially facilitate the design of better network architectures by identifying redundancy and less-effective network units.<\/jats:p>","DOI":"10.3390\/e20110823","type":"journal-article","created":{"date-parts":[[2018,10,26]],"date-time":"2018-10-26T03:16:16Z","timestamp":1540523776000},"page":"823","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":10,"title":["Dissecting Deep Learning Networks\u2014Visualizing Mutual Information"],"prefix":"10.3390","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9365-7420","authenticated-orcid":false,"given":"Hui","family":"Fang","sequence":"first","affiliation":[{"name":"Computer Science Department, Liverpool John Moores University, Liverpool L3 3AF, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Victoria","family":"Wang","sequence":"additional","affiliation":[{"name":"Institute for Criminal Justice Studies, University of Portsmouth, Portsmouth PO1 2HY, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Motonori","family":"Yamaguchi","sequence":"additional","affiliation":[{"name":"Department of Psychology, Edge Hill University, Ormskirk L39 4QP, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2018,10,26]]},"reference":[{"key":"ref_1","unstructured":"Krizhevsky, A., Sutskever, I., and Hinton, G.E. (2012, January 3\u20136). ImageNet classification with deep convolutional neural networks. Proceedings of the Twenty-sixth Conference on Neural Information Processing Systems (NIPS 2012), Lake Tahoe, NV, USA."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Yu, D., Seide, F., and Li, G. (July, January 26). Conversational Speech Transcription Using Context-Dependent Deep Neural Networks. Proceedings of the 29th International Conference on Machine Learning (ICML 2012), Edinburgh, UK.","DOI":"10.21437\/Interspeech.2011-169"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"529","DOI":"10.1038\/nature14236","article-title":"Human-level control through deep reinforcement learning","volume":"518","author":"Mnih","year":"2015","journal-title":"Nature"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"16","DOI":"10.1109\/MIS.2015.69","article-title":"Deep neural networks in machine translation: An overview","volume":"30","author":"Zhang","year":"2015","journal-title":"IEEE Intell. Syst."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"484","DOI":"10.1038\/nature16961","article-title":"Mastering the game of Go with deep neural networks and tree search","volume":"529","author":"Silver","year":"2016","journal-title":"Nature"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Fang, H., Thiyagalingam, J., Bessis, N., and Edirisinghe, E. (2017, January 17\u201320). Fast and reliable human action recognition in video sequences by sequential analysis. Proceedings of the International Conference on Image Processing (ICIP), Beijing, China.","DOI":"10.1109\/ICIP.2017.8297028"},{"key":"ref_8","unstructured":"Shwartz-Ziv, R., and Tishby, N. (arXiv, 2017). Opening the black box of deep neural networks via information, arXiv."},{"key":"ref_9","unstructured":"Saxe, A., Bansal, Y., Dapello, J., Advani, M., and Kolchinsky, A. (May, January 30). On the Information bottleneck theory of deep learning. Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada."},{"key":"ref_10","unstructured":"Vasudevan, S. (arXiv, 2018). Dynamic learning rate using Mutual Information, arXiv."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Hodas, N.O., and Stinis, P. (arXiv, 2018). Doing the impossible: Why neural networks can be trained at all, arXiv.","DOI":"10.3389\/fpsyg.2018.01185"},{"key":"ref_12","unstructured":"Hjelm, R.D., Fedorov, A., Lavoie-Marchildon, S., Grewal, K., Trischler, A., and Bengio, Y. (arXiv, 2018). Learning deep representations by mutual information estimation and maximization, arXiv."},{"key":"ref_13","first-page":"318","article-title":"Learning internal representations by error propagation","volume":"Volume 1","author":"Rumelhart","year":"1986","journal-title":"Parallel Distributed Processing"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"354","DOI":"10.1016\/j.patcog.2017.10.013","article-title":"Recent advances in convolutional neural networks","volume":"77","author":"Gu","year":"2018","journal-title":"Pattern Recognit."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Huang, G., Liu, Z., Weinberger, K.Q., and van der Maaten, L. (2017, January 21\u201326). Densely connected convolutional networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.243"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Schroff, F., Kalenichenko, D., and Philbin, J. (2015, January 7\u201312). Facenet: A unified embedding for face recognition and clustering. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298682"},{"key":"ref_17","unstructured":"Simonyan, K., and Zisserman, A. (2014, January 8\u201313). Two-stream convolutional networks for action recognition in videos. Proceedings of the Twenty-eighth Conference on Neural Information Processing Systems (NIPS 2014), Montreal, Canada."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"110","DOI":"10.1145\/2897824.2925974","article-title":"Let there be color! Joint end-to-end learning of global and local image priors for automatic image colorization with simultaneous classification","volume":"35","author":"Iizuka","year":"2016","journal-title":"ACM Trans. Graph."},{"key":"ref_19","unstructured":"Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. (2014, January 8\u201313). Generative adversarial nets. Proceedings of the Twenty-eighth Conference on Neural Information Processing Systems (NIPS 2014), Montreal, QC, Canada."},{"key":"ref_20","unstructured":"Radford, A., Metz, L., and Chintala, S. (arXiv, 2015). Unsupervised representation learning with deep convolutional generative adversarial networks, arXiv."},{"key":"ref_21","unstructured":"Srivastava, R.K., Greff, K., and Schmidhuber, J. (2015, January 7\u201312). Training very deep networks. Proceedings of the Twenty-ninth Conference on Neural Information Processing Systems (NIPS 2015), Montreal, Canada."},{"key":"ref_22","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (July, January 26). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"986","DOI":"10.1109\/TMI.2003.815867","article-title":"Mutual-information-based registration of medical images: A survey","volume":"22","author":"Pluim","year":"2003","journal-title":"IEEE Trans. Med. Imaging"},{"key":"ref_24","unstructured":"Chen, X., Duan, Y., Houthooft, R., Schulman, J., Sutskever, I., and Abbeel, P. (2016, January 5\u201310). Infogan: Interpretable representation learning by information maximizing generative adversarial nets. Proceedings of the Twenty-ninth Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"1206","DOI":"10.1109\/TVCG.2010.132","article-title":"An information-theoretic framework for visualization","volume":"16","author":"Chen","year":"2010","journal-title":"IEEE Trans. Vis. Comput. Graph."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"1216","DOI":"10.1109\/TVCG.2010.131","article-title":"An information-theoretic framework for flow visualization","volume":"16","author":"Xu","year":"2010","journal-title":"IEEE Trans. Vis. Comput. Graph."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"254","DOI":"10.3390\/e13010254","article-title":"Information theory in scientific visualization","volume":"13","author":"Wang","year":"2011","journal-title":"Entropy"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Alsakran, J., Huang, X., Zhao, Y., Yang, J., and Fast, K. (2014, January 4\u20137). Using entropy-related measures in categorical data visualization. Proceedings of the IEEE Pacific Visualization Symposium (PacificVis), Yokohama, Japan.","DOI":"10.1109\/PacificVis.2014.43"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Zeiler, M.D., and Fergus, R. (2014). Visualizing and understanding convolutional networks. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-10590-1_53"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Bau, D., Zhou, B., Khosla, A., Oliva, A., and Torralba, A. (2017, January 21\u201326). Network dissection: Quantifying interpretability of deep visual representations. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.354"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1109\/TVCG.2016.2598831","article-title":"Towards better analysis of deep convolutional neural networks","volume":"23","author":"Liu","year":"2017","journal-title":"IEEE Trans. Vis. Comput. Graph."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"1415","DOI":"10.1109\/5.58323","article-title":"30 years of adaptive neural networks: Perceptron, madaline, and backpropagation","volume":"78","author":"Widrow","year":"1990","journal-title":"Proc. IEEE"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"2278","DOI":"10.1109\/5.726791","article-title":"Gradient-based learning applied to document recognition","volume":"86","author":"LeCun","year":"1998","journal-title":"Proc. IEEE"},{"key":"ref_34","unstructured":"Kingma, D.P., and Ba, J. (arXiv, 2014). Adam: A method for stochastic optimization, arXiv."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"5629","DOI":"10.1109\/TIT.2018.2807481","article-title":"Demystifying fixed k-nearest neighbor information estimators","volume":"8","author":"Gao","year":"2018","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Kolchinsky, A., and Tracey, B. (2017). Estimating mixture entropy with pairwise distance. Entropy, 19.","DOI":"10.3390\/e19070361"},{"key":"ref_37","unstructured":"Tishby, N., Pereira, F.C., and Bialek, W. (1999, January 22\u201324). The information bottleneck method. Proceedings of the 37-th Annual Allerton Conference on Communication, Control and Computing, Monticello, IL, USA."},{"key":"ref_38","unstructured":"Quinlan, J.R. (1996, January 4\u20138). Bagging, boosting, and C4.5. Proceedings of the Thirteenth National Conference on Artificial Intelligence and Eighth Innovative Applications of Artificial Intelligence Conference, Portland, OR, USA."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"901","DOI":"10.1111\/j.1467-8659.2011.01939.x","article-title":"Visualization of time-series data in parameter space for understanding facial dynamics","volume":"30","author":"Tam","year":"2011","journal-title":"Comput. Graph. Forum"},{"key":"ref_40","unstructured":"Saraiya, P., North, C., and Duca, K. (2004, January 10\u201312). An evaluation of microarray visualization tools for biological insight. Proceedings of the IEEE Symposium on Information Visualization, Austin, TX, USA."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"1228","DOI":"10.1109\/TVCG.2012.312","article-title":"Visualizing natural image statistics","volume":"19","author":"Fang","year":"2013","journal-title":"IEEE Trans. Vis. Comput. Graph."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"566","DOI":"10.1093\/nar\/gkv468","article-title":"ClustVis: A web tool for visualizing clustering of multivariate data using Principal Component Analysis and heatmap","volume":"43","author":"Metsalu","year":"2015","journal-title":"Nucleic Acids Res."},{"key":"ref_43","unstructured":"Xiao, H., Rasul, K., and Vollgraf, R. (arXiv, 2017). Fashion-mnist: A novel image dataset for benchmarking machine learning algorithms, arXiv."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/20\/11\/823\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T15:26:20Z","timestamp":1760196380000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/20\/11\/823"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,10,26]]},"references-count":43,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2018,11]]}},"alternative-id":["e20110823"],"URL":"https:\/\/doi.org\/10.3390\/e20110823","relation":{},"ISSN":["1099-4300"],"issn-type":[{"type":"electronic","value":"1099-4300"}],"subject":[],"published":{"date-parts":[[2018,10,26]]}}}