{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T02:21:51Z","timestamp":1760235711567,"version":"build-2065373602"},"reference-count":39,"publisher":"MDPI AG","issue":"19","license":[{"start":{"date-parts":[[2021,9,24]],"date-time":"2021-09-24T00:00:00Z","timestamp":1632441600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Recently, deep convolutional neural networks (CNN) with inception modules have attracted much attention due to their excellent performances on diverse domains. Nevertheless, the basic CNN can only capture a univariate feature, which is essentially linear. It leads to a weak ability in feature expression, further resulting in insufficient feature mining. In view of this issue, researchers incessantly deepened the network, bringing parameter redundancy and model over-fitting. Hence, whether we can employ this efficient deep neural network architecture to improve CNN and enhance the capacity of image recognition task still remains unknown. In this paper, we introduce spike-and-slab units to the modified inception module, enabling our model to capture dual latent variables and the average and covariance information. This operation further enhances the robustness of our model to variations of image intensity without increasing the model parameters. 
The results of several tasks demonstrated that dual variable operations can be well-integrated into inception modules, and excellent results have been achieved.<\/jats:p>","DOI":"10.3390\/s21196382","type":"journal-article","created":{"date-parts":[[2021,9,27]],"date-time":"2021-09-27T22:16:38Z","timestamp":1632780998000},"page":"6382","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Learning Hierarchical Representations with Spike-and-Slab Inception Network"],"prefix":"10.3390","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6770-4497","authenticated-orcid":false,"given":"Weizheng","family":"Qiao","sequence":"first","affiliation":[{"name":"College of Information and Communication Engineering, Harbin Engineering University, Harbin 150001, China"}]},{"given":"Xiaojun","family":"Bi","sequence":"additional","affiliation":[{"name":"College of Information Engineering, Minzu University of China, Beijing 100091, China"}]}],"member":"1968","published-online":{"date-parts":[[2021,9,24]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., and Rabinovich, A. (2015, January 7\u201312). Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Al-Qizwini, M., Barjasteh, I., Al-Qassab, H., and Radha, H. (2017, January 11\u201314). Deep learning algorithm for autonomous driving using googlenet. Proceedings of the Intelligent Vehicles Symposium, Redondo Beach, CA, USA.","DOI":"10.1109\/IVS.2017.7995703"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Ballester, P., and de Araujo, R.M. (2016, January 12\u201317). On the performance of googlenet and alexnet applied to sketches. 
Proceedings of the AAAI, Phoenix, AZ, USA.","DOI":"10.1609\/aaai.v30i1.10171"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"188","DOI":"10.1016\/j.neucom.2016.11.023","article-title":"G-ms2f: Googlenet based multi-stage feature fusion of deep cnn for scene recognition","volume":"225","author":"Tang","year":"2017","journal-title":"Neurocomputing"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna, Z. (2016, January 27\u201330). Rethinking the inception architecture for computer vision. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.308"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Ioffe, S., Vanhoucke, V., and Alemi, A.A. (2017, January 4\u20139). Inception-v4, inception-resnet and the impact of residual connections on learning. Proceedings of the AAAI, San Francisco, CA, USA.","DOI":"10.1609\/aaai.v31i1.11231"},{"key":"ref_7","unstructured":"Kingma, D.P., Mohamed, S., Rezende, D.J., and Welling, M. (2014, January 8\u201311). Semi-supervised learning with deep generative models. Proceedings of the 27th International Conference on Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_8","unstructured":"Srivastava, N., and Salakhutdinov, R.R. (2012, January 3\u20138). Multimodal learning with deep boltzmann machines. Proceedings of the 25th International Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Yeh, R.A., Chen, C., Lim, T.Y., Schwing, A.G., Hasegawa-Johnson, M., and Do, M.N. (2017, January 22\u201325). Semantic image inpainting with deep generative models. 
Proceedings of the CVPR, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.728"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"153","DOI":"10.1049\/ip-vis:20030362","article-title":"Continuous restricted Boltzmann machine with an implementable training algorithm","volume":"150","author":"Chen","year":"2003","journal-title":"IEEE Proc. Vis. Image Signal Process."},{"key":"ref_11","first-page":"233","article-title":"A Spike and Slab Restricted Boltzmann Machine","volume":"15","author":"Courville","year":"2011","journal-title":"J. Mach. Learn. Res."},{"key":"ref_12","unstructured":"Courville, A.C., Bergstra, J., and Bengio, Y. (July, January 28). Unsupervised Models of Images by Spike-and-Slab RBMs. Proceedings of the 28th International Conference on Machine Learning (ICML 2011), Bellevue, WA, USA."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1874","DOI":"10.1109\/TPAMI.2013.238","article-title":"The Spike-and-Slab RBM and Extensions to Discrete and Sparse Data Distributions","volume":"36","author":"Courville","year":"2014","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_14","unstructured":"LeCun, Y., Boser, B.E., Denker, J.S., Henderson, D., Howard, R.E., Hubbard, W.E., and Jackel, L.D. (1990, January 2\u20135). Handwritten digit recognition with a backpropagation network. Proceedings of the 2nd International Conference on Neural Information Processing Systems, Denver, CO, USA."},{"key":"ref_15","first-page":"1097","article-title":"Imagenet classification with deep convolutional neural networks","volume":"25","author":"Krizhevsky","year":"2012","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_16","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_17","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (July, January 26). Deep residual learning for image recognition. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Zhang, X., Zhou, X., Lin, M., and Sun, J. (2018, January 18\u201323). Shufflenet: An extremely efficient convolutional neural network for mobile devices. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00716"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Huang, G., Liu, Z., Van Der Maaten, L., and Weinberger, K.Q. (2017, January 21\u201326). Densely connected convolutional networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.243"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Hu, J., Shen, L., and Sun, G. (2018, January 18\u201323). Squeeze-and-excitation networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00745"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Lee, H., Grosse, R., Ranganath, R., and Ng, A.Y. (2009, January 14\u201318). Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. Proceedings of the 26th Annual International Conference on Machine Learning, Montreal, QC, Canada.","DOI":"10.1145\/1553374.1553453"},{"key":"ref_22","unstructured":"Lin, M., Chen, Q., and Yan, S. (2013). Network in network. arXiv."},{"key":"ref_23","unstructured":"Arora, S., Bhaskara, A., Ge, R., and Ma, T. (2014, January 21\u201326). Provable bounds for learning some deep representations. Proceedings of the International Conference on Machine Learning, Beijing, China."},{"key":"ref_24","unstructured":"Hope, T., Resheff, Y.S., and Lieder, I. (2017). 
Learning Tensorflow: A Guide to Building Deep Learning Systems, O\u2019Reilly Media, Inc."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"36","DOI":"10.1016\/j.neucom.2015.07.136","article-title":"A robust local sparse coding method for image classification with Histogram Intersection Kernel","volume":"184","author":"Li","year":"2016","journal-title":"Neurocomputing"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Bo, L., Ren, X., and Fox, D. (2013, January 23\u201328). Multipath Sparse Coding Using Hierarchical Matching Pursuit. Proceedings of the Computer Vision & Pattern Recognition, Portland, OR, USA.","DOI":"10.1109\/CVPR.2013.91"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1109\/LSP.2012.2228852","article-title":"Reference-Based Scheme Combined With K-SVD for Scene Image Categorization","volume":"20","author":"Li","year":"2013","journal-title":"IEEE Signal Process. Lett."},{"key":"ref_28","unstructured":"Yang, J., Yu, K., Gong, Y., and Huang, T. (2009, January 20\u201325). Linear spatial pyramid matching using sparse coding for image classification. Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), Miami, FL, USA."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Zeiler, M.D., and Fergus, R. (2013). Visualizing and understanding convolutional neural networks. arXiv.","DOI":"10.1007\/978-3-319-10590-1_53"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"46","DOI":"10.1016\/j.neucom.2016.01.100","article-title":"Hierarchical learning of large-margin metrics for large-scale image classification","volume":"208","author":"Lei","year":"2016","journal-title":"Neurocomputing"},{"key":"ref_31","unstructured":"Zhou, B., Lapedriza, A., Xiao, J., Torralba, A., and Oliva, A. (2014, January 8\u201313). Learning deep features for scene recognition using places database. 
Proceedings of the Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"97","DOI":"10.1016\/j.patcog.2017.01.029","article-title":"Hierarchical learning of multi-task sparse metrics for large-scale image classification","volume":"67","author":"Zheng","year":"2017","journal-title":"Pattern Recognit."},{"key":"ref_33","unstructured":"Sermanet, P., Eigen, D., Zhang, X., Mathieu, M., Fergus, R., and LeCun, Y. (2013). OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks. arXiv."},{"key":"ref_34","unstructured":"Howard, A.G. (2013). Some improvements on deep convolutional neural network based image classification. arXiv."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Larochelle, H., Erhan, D., Courville, A., Bergstra, J., and Bengio, Y. (2007, January 20\u201324). An empirical evaluation of deep architectures on problems with many factors of variation. Proceedings of the 24th International Conference on Machine Learning, Corvallis, OR, USA.","DOI":"10.1145\/1273496.1273556"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Cho, K., Raiko, T., and Ilin, A. (2013, January 4\u20139). Gaussian-Bernoulli Deep Boltzmann Machine. Proceedings of the 2013 International Joint Conference on Neural Networks (IJCNN), Dallas, TX, USA.","DOI":"10.1109\/IJCNN.2013.6706831"},{"key":"ref_37","unstructured":"Srivastava, R.K., Greff, K., and Schmidhuber, J. (2015). Highway networks. 
arXiv."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"106419","DOI":"10.1016\/j.knosys.2020.106419","article-title":"A convolutional neural network with sparse representation","volume":"209","author":"Yang","year":"2020","journal-title":"Knowl.-Based Syst."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"152","DOI":"10.1016\/j.neucom.2019.10.007","article-title":"Autonomous deep learning: A genetic DCNN designer for image classification","volume":"379","author":"Ma","year":"2020","journal-title":"Neurocomputing"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/19\/6382\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T07:04:31Z","timestamp":1760166271000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/19\/6382"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,9,24]]},"references-count":39,"journal-issue":{"issue":"19","published-online":{"date-parts":[[2021,10]]}},"alternative-id":["s21196382"],"URL":"https:\/\/doi.org\/10.3390\/s21196382","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2021,9,24]]}}}