{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T01:51:37Z","timestamp":1760233897071,"version":"build-2065373602"},"reference-count":50,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2021,3,6]],"date-time":"2021-03-06T00:00:00Z","timestamp":1614988800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001871","name":"Funda\u00e7\u00e3o para a Ci\u00eancia e a Tecnologia","doi-asserted-by":"publisher","award":["POCI-01-0145-FEDER-028857"],"award-info":[{"award-number":["POCI-01-0145-FEDER-028857"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>In recent years, deep neural networks have shown significant progress in computer vision due to their large generalization capacity; however, the overfitting problem ubiquitously threatens the learning process of these highly nonlinear architectures. Dropout is a recent solution to mitigate overfitting that has witnessed significant success in various classification applications. Recently, many efforts have been made to improve the Standard dropout using an unsupervised merit-based semantic selection of neurons in the latent space. However, these studies do not consider the task-relevant information quality and quantity and the diversity of the latent kernels. To solve the challenge of dropping less informative neurons in deep learning, we propose an efficient end-to-end dropout algorithm that selects the most informative neurons with the highest correlation with the target output considering the sparsity in its selection procedure. First, to promote activation diversity, we devise an approach to select the most diverse set of neurons by making use of determinantal point process (DPP) sampling. Furthermore, to incorporate task specificity into deep latent features, a mutual information (MI)-based merit function is developed. Leveraging the proposed MI with DPP sampling, we introduce the novel DPPMI dropout that adaptively adjusts the retention rate of neurons based on their contribution to the neural network task. Empirical studies on real-world classification benchmarks including, MNIST, SVHN, CIFAR10, CIFAR100, demonstrate the superiority of our proposed method over recent state-of-the-art dropout algorithms in the literature.<\/jats:p>","DOI":"10.3390\/s21051846","type":"journal-article","created":{"date-parts":[[2021,3,7]],"date-time":"2021-03-07T21:52:15Z","timestamp":1615153935000},"page":"1846","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":10,"title":["Maximum Relevance Minimum Redundancy Dropout with Informative Kernel Determinantal Point Process"],"prefix":"10.3390","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8336-8542","authenticated-orcid":false,"given":"Mohsen","family":"Saffari","sequence":"first","affiliation":[{"name":"INESC TEC and Faculty of Engineering, University of Porto, 4200-465 Porto, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4683-7810","authenticated-orcid":false,"given":"Mahdi","family":"Khodayar","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Tulsa, Tulsa, OK 74104, USA"}]},{"given":"Mohammad Saeed","family":"Ebrahimi Saadabadi","sequence":"additional","affiliation":[{"name":"Faculty of Electrical Engineering, K. N. Toosi University of Technology, Tehran 16315-1355, Iran"}]},{"given":"Ana F.","family":"Sequeira","sequence":"additional","affiliation":[{"name":"INESC TEC, 4200-465 Porto, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3760-2473","authenticated-orcid":false,"given":"Jaime S.","family":"Cardoso","sequence":"additional","affiliation":[{"name":"INESC TEC and Faculty of Engineering, University of Porto, 4200-465 Porto, Portugal"}]}],"member":"1968","published-online":{"date-parts":[[2021,3,6]]},"reference":[{"key":"ref_1","unstructured":"Pereira, J.A., Sequeira, A.F., Pernes, D., and Cardoso, J.S. (2020, January 16\u201318). A robust fingerprint presentation attack detection method against unseen attacks through adversarial learning. Proceedings of the 2020 International Conference of the Biometrics Special Interest Group (BIOSIG), Darmstadt, Germany."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1625","DOI":"10.1109\/TII.2020.2971014","article-title":"Probabilistic Time-Varying Parameter Identification for Load Modeling: A Deep Generative Approach","volume":"17","author":"Khodayar","year":"2020","journal-title":"IEEE Trans. Ind. Inform."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Noormohammadi-Asl, A., Saffari, M., and Teshnehlab, M. (2018, January 8). Neural control of mobile robot motion based on feedback error learning and mimetic structure. Proceedings of the Iranian Conference on Electrical Engineering (ICEE), Mashhad, Iran.","DOI":"10.1109\/ICEE.2018.8472657"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"670","DOI":"10.1109\/TSTE.2018.2844102","article-title":"Spatio-temporal graph deep neural network for short-term wind speed forecasting","volume":"10","author":"Khodayar","year":"2018","journal-title":"IEEE Trans. Sustain. Energy"},{"key":"ref_5","first-page":"950","article-title":"A simple weight decay can improve generalization","volume":"4","author":"Moody","year":"1995","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_6","unstructured":"Cogswell, M., Ahmed, F., Girshick, R., Zitnick, L., and Batra, D. (2015). Reducing overfitting in deep networks by decorrelating representations. arXiv."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"257","DOI":"10.1162\/NECO_a_00801","article-title":"Correlational neural networks","volume":"28","author":"Chandar","year":"2016","journal-title":"Neural Comput."},{"key":"ref_8","unstructured":"Ioffe, S., and Szegedy, C. (2015, January 6\u201311). Batch normalization: Accelerating deep network training by reducing internal covariate shift. Proceedings of the International Conference on Machine Learning (PMLR), Lille, France."},{"key":"ref_9","unstructured":"Yang, T., Zhu, S., and Chen, C. (2020). Gradaug: A new regularization method for deep neural networks. arXiv."},{"key":"ref_10","first-page":"1929","article-title":"Dropout: A simple way to prevent neural networks from overfitting","volume":"15","author":"Srivastava","year":"2014","journal-title":"J. Mach. Learn. Res."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Vivek, B., and Babu, R.V. (2020, January 14\u201319). Single-step adversarial training with dropout scheduling. Proceedings of the 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00103"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Mirzadeh, S.I., Farajtabar, M., and Ghasemzadeh, H. (2020, January 14\u201319). Dropout as an implicit gating mechanism for continual learning. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops, Seattle, WA, USA.","DOI":"10.1109\/CVPRW50498.2020.00124"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Kulesza, A., and Taskar, B. (2012). Determinantal point processes for machine learning. arXiv.","DOI":"10.1561\/9781601986290"},{"key":"ref_14","unstructured":"Ba, L. (2013). Adaptive Dropout for Training Deep Neural Networks. [Ph.D. Thesis, University of Toronto]."},{"key":"ref_15","unstructured":"Keshari, R., Singh, R., and Vatsa, M. (February, January 27). Guided dropout. Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA."},{"key":"ref_16","unstructured":"Saito, K., Ushiku, Y., Harada, T., and Saenko, K. (2017). Adversarial dropout regularization. arXiv."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Dodballapur, V., Calisa, R., Song, Y., and Cai, W. (2020, January 18\u201322). Automatic Dropout for Deep Neural Networks. Proceedings of the International Conference on Neural Information Processing, Bangkok, Thailand.","DOI":"10.1007\/978-3-030-63836-8_16"},{"key":"ref_18","unstructured":"Gomez, A.N., Zhang, I., Kamalakara, S.R., Madaan, D., Swersky, K., Gal, Y., and Hinton, G.E. (2019). Learning sparse networks using targeted dropout. arXiv."},{"key":"ref_19","unstructured":"Wan, L., Zeiler, M., Zhang, S., Le Cun, Y., and Fergus, R. (2013, January 17\u201319). Regularization of neural networks using dropconnect. Proceedings of the International Conference on Machine Learning (PMLR), Scottsdale, AZ, USA."},{"key":"ref_20","unstructured":"Goodfellow, I., Warde-Farley, D., Mirza, M., Courville, A., and Bengio, Y. (2013, January 17\u201319). Maxout networks. Proceedings of the International Conference on Machine Learning (PMLR), Scottsdale, AZ, USA."},{"key":"ref_21","unstructured":"Li, Z., Gong, B., and Yang, T. (2016). Improved dropout for shallow and deep learning. arXiv."},{"key":"ref_22","unstructured":"Ko, B., Kim, H.G., Oh, K.J., and Choi, H.J. (2017, January 13\u201316). Controlled dropout: A different approach to using dropout on deep neural network. Proceedings of the 2017 IEEE International Conference on Big Data and Smart Computing (BigComp), Jeju Island, Korea."},{"key":"ref_23","unstructured":"Gal, Y., Hron, J., and Kendall, A. (2017). Concrete dropout. arXiv."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"173","DOI":"10.1016\/j.neucom.2019.04.090","article-title":"Mutual information-based dropout: Learning deep relevant feature representation architectures","volume":"361","author":"Chen","year":"2019","journal-title":"Neurocomputing"},{"key":"ref_25","unstructured":"DeVries, T., and Taylor, G.W. (2017). Improved regularization of convolutional neural networks with cutout. arXiv."},{"key":"ref_26","unstructured":"Ghiasi, G., Lin, T.Y., and Le, Q.V. (2018). Dropblock: A regularization method for convolutional networks. arXiv."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Huang, G., Sun, Y., Liu, Z., Sedra, D., and Weinberger, K.Q. (2016, January 8\u201316). Deep networks with stochastic depth. Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46493-0_39"},{"key":"ref_28","unstructured":"Singh, S., Hoiem, D., and Forsyth, D. (2016). Swapout: Learning an ensemble of deep architectures. arXiv."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"1696","DOI":"10.1109\/TNNLS.2019.2921952","article-title":"Energy disaggregation via deep temporal dictionary learning","volume":"31","author":"Khodayar","year":"2019","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Salehinejad, H., and Valaee, S. (2019, January 12\u201317). Ising-dropout: A regularization method for training and compression of deep neural networks. Proceedings of the 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK.","DOI":"10.1109\/ICASSP.2019.8682914"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"2897","DOI":"10.1109\/TPAMI.2017.2784440","article-title":"Information dropout: Learning optimal representations through noisy computation","volume":"40","author":"Achille","year":"2018","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_32","unstructured":"Wang, S., and Manning, C. (2013, January 17\u201319). Fast dropout training. Proceedings of the International Conference on Machine Learning (PMLR), Scottsdale, AZ, USA."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"36140","DOI":"10.1109\/ACCESS.2019.2904881","article-title":"Beta-Dropout: A Unified Dropout","volume":"7","author":"Liu","year":"2019","journal-title":"IEEE Access"},{"key":"ref_34","unstructured":"Krueger, D., Maharaj, T., Kram\u00e1r, J., Pezeshki, M., Ballas, N., Ke, N.R., Goyal, A., Bengio, Y., Courville, A., and Pal, C. (2016). Zoneout: Regularizing rnns by randomly preserving hidden activations. arXiv."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"3235","DOI":"10.1109\/TII.2018.2809730","article-title":"Deep learning-based feature representation and its application for soft sensor modeling with variable-wise weighted SAE","volume":"14","author":"Yuan","year":"2018","journal-title":"IEEE Trans. Ind. Inform."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"2278","DOI":"10.1109\/5.726791","article-title":"Gradient-based learning applied to document recognition","volume":"86","author":"LeCun","year":"1998","journal-title":"Proc. IEEE"},{"key":"ref_37","unstructured":"Netzer, Y., Wang, T., Coates, A., Bissacco, A., Wu, B., and Ng, A.Y. (2021, March 02). Reading digits in natural images with unsupervised feature learning. NIPS Workshop on Deep Learning and Unsupervised Feature Learning, Available online: https:\/\/www.researchgate.net\/publication\/266031774_Reading_Digits_in_Natural_Images_with_Unsupervised_Feature_Learning."},{"key":"ref_38","unstructured":"Krizhevsky, A., and Hinton, G. (2021, March 02). Learning Multiple Layers of Features from Tiny Images. Available online: https:\/\/scholar.google.com\/scholar?as_q=Learning+multiple+layers+of+features+from+tiny+images&as_occt=title&hl=en&as_sdt=0%2C31."},{"key":"ref_39","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_41","unstructured":"Glorot, X., and Bengio, Y. (2010, January 13\u201315). Understanding the difficulty of training deep feedforward neural networks. Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics (JMLR Workshop and Conference Proceedings), Sardinia, Italy."},{"key":"ref_42","unstructured":"Ver Steeg, G. (2021, March 02). Non-Parametric Entropy Estimation Toolbox (Npeet). Available online: https:\/\/www.isi.edu\/~gregv\/npeet.html."},{"key":"ref_43","first-page":"1","article-title":"DPPy: DPP Sampling with Python","volume":"20","author":"Gautier","year":"2019","journal-title":"J. Mach. Learn. Res."},{"key":"ref_44","unstructured":"Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G.S., Davis, A., Dean, J., and Devin, M. (2016). Tensorflow: Large-scale machine learning on heterogeneous distributed systems. arXiv."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Xie, J., Ma, Z., Zhang, G., Xue, J.H., Tan, Z.H., and Guo, J. (2020). Advanced Dropout: A Model-free Methodology for Bayesian Dropout Optimization. arXiv.","DOI":"10.1109\/TPAMI.2021.3083089"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"106882","DOI":"10.1016\/j.tej.2020.106882","article-title":"Deep learning for pattern recognition of photovoltaic energy generation","volume":"34","author":"Khodayar","year":"2021","journal-title":"Electr. J."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Li, X., Chen, S., Hu, X., and Yang, J. (2019, January 16\u201320). Understanding the disharmony between dropout and batch normalization by variance shift. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00279"},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"12777","DOI":"10.1007\/s11042-019-08453-9","article-title":"Dropout vs. batch normalization: An empirical study of their impact to deep learning","volume":"79","author":"Garbin","year":"2020","journal-title":"Multimed. Tools Appl."},{"key":"ref_49","unstructured":"Luo, P., Wang, X., Shao, W., and Peng, Z. (2018). Towards understanding regularization in batch normalization. arXiv."},{"key":"ref_50","unstructured":"Chen, G., Chen, P., Shi, Y., Hsieh, C.Y., Liao, B., and Zhang, S. (2019). Rethinking the usage of batch normalization and dropout in the training of deep neural networks. arXiv."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/5\/1846\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T05:34:03Z","timestamp":1760160843000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/5\/1846"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,3,6]]},"references-count":50,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2021,3]]}},"alternative-id":["s21051846"],"URL":"https:\/\/doi.org\/10.3390\/s21051846","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2021,3,6]]}}}