{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,15]],"date-time":"2026-04-15T18:26:41Z","timestamp":1776277601199,"version":"3.50.1"},"reference-count":34,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2024,5,6]],"date-time":"2024-05-06T00:00:00Z","timestamp":1714953600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,5,6]],"date-time":"2024-05-06T00:00:00Z","timestamp":1714953600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100013061","name":"Jilin Provincial Scientific and Technological Development Program","doi-asserted-by":"publisher","award":["20230101174JC"],"award-info":[{"award-number":["20230101174JC"]}],"id":[{"id":"10.13039\/501100013061","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100013061","name":"Jilin Provincial Scientific and Technological Development Program","doi-asserted-by":"publisher","award":["20200401090GX"],"award-info":[{"award-number":["20200401090GX"]}],"id":[{"id":"10.13039\/501100013061","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Complex Intell. Syst."],"published-print":{"date-parts":[[2024,8]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>In the field of deep learning, the attention mechanism, as a technology that mimics human perception and attention processes, has made remarkable achievements. The current methods combine a channel attention mechanism and a spatial attention mechanism in a parallel or cascaded manner to enhance the model representational competence, but they do not fully consider the interaction between spatial and channel information. 
This paper proposes a method in which a space embedded channel module and a channel embedded space module are cascaded to enhance the model\u2019s representational competence. First, in the space embedded channel module, to strengthen the representation of the region of interest in different spatial dimensions, the input tensor is split into horizontal and vertical branches along the spatial dimensions, alleviating the loss of position information caused by 2D pooling. To smooth the features while highlighting local features, four branches are obtained through global maximum and global average pooling, and aggregating the features of each pooling method yields two feature tensors. To enable the output horizontal and vertical feature tensors to attend to both pooling features simultaneously, the two feature tensors are segmented and dimensionally transposed along the spatial dimensions, and the features are then aggregated along the spatial direction. Then, in the channel embedded space module, to address the lack of cross-channel connections between groups in grouped convolution and its large parameter count, this paper uses adaptive grouped banded matrices. Exploiting the mapping relationship between the number of channels and the convolution kernel size, the banded matrices adaptively compute the kernel size to achieve adaptive cross-channel interaction, enhancing the correlation between the channel dimensions while keeping the spatial dimensions unchanged. Finally, the output horizontal and vertical weights are used as attention weights. In the experiments, the proposed attention mechanism module is embedded into the MobileNetV2 and ResNet networks at different depths, and extensive experiments are conducted on the CIFAR-10, CIFAR-100 and STL-10 datasets. 
The results show that the method in this paper captures and utilizes the features of the input data more effectively than the other methods, significantly improving the classification accuracy. Although it introduces an additional computational burden (0.5\u00a0M), the model still achieves the best overall performance when the computational overhead is comprehensively considered.<\/jats:p>","DOI":"10.1007\/s40747-024-01445-9","type":"journal-article","created":{"date-parts":[[2024,5,6]],"date-time":"2024-05-06T06:02:28Z","timestamp":1714975348000},"page":"5427-5444","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":51,"title":["An attention mechanism module with spatial perception and channel information interaction"],"prefix":"10.1007","volume":"10","author":[{"given":"Yifan","family":"Wang","sequence":"first","affiliation":[]},{"given":"Wu","family":"Wang","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8314-6448","authenticated-orcid":false,"given":"Yang","family":"Li","sequence":"additional","affiliation":[]},{"given":"Yaodong","family":"Jia","sequence":"additional","affiliation":[]},{"given":"Yu","family":"Xu","sequence":"additional","affiliation":[]},{"given":"Yu","family":"Ling","sequence":"additional","affiliation":[]},{"given":"Jiaqi","family":"Ma","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,5,6]]},"reference":[{"issue":"8","key":"1445_CR1","doi-asserted-by":"publisher","DOI":"10.1016\/j.inffus.2023.101821","volume":"97","author":"Z Cristina","year":"2023","unstructured":"Cristina Z, Eugenio MC, Enrique HV, Iyad AK, Francisco H (2023) Explainable crowd decision making methodology guided by expert natural language opinions based on sentiment analysis with attention-based deep learning and subgroup discovery. Inf Fusion 97(8):101821. 
https:\/\/doi.org\/10.1016\/j.inffus.2023.101821","journal-title":"Inf Fusion"},{"key":"1445_CR2","doi-asserted-by":"publisher","first-page":"6953","DOI":"10.1007\/s40747-023-01106-3","volume":"9","author":"S Zhang","year":"2023","unstructured":"Zhang S, Wei Z, Xu W, Zhang LL, Wang Y, Zhou X, Liu JY (2023) DSC-MVSNet: attention aware cost volume regularization based on depthwise separable convolution for multi-view stereo. Complex Intell 9:6953\u20136969. https:\/\/doi.org\/10.1007\/s40747-023-01106-3","journal-title":"Complex Intell"},{"key":"1445_CR3","doi-asserted-by":"publisher","DOI":"10.1088\/2632-2153\/acc0d5","volume":"4","author":"RK Lakshmi","year":"2023","unstructured":"Lakshmi RK, Rama SA (2023) Novel heuristic-based hybrid ResNeXt with recurrent neural network to handle multi class classification of sentiment analysis. Mach Learn: Sci Technol 4:015033. https:\/\/doi.org\/10.1088\/2632-2153\/acc0d5","journal-title":"Mach Learn: Sci Technol"},{"key":"1445_CR4","doi-asserted-by":"publisher","unstructured":"Hu J, Shen L, Sun G (2018) Squeeze-and-excitation networks. In: CVPR 7132\u20137141. https:\/\/doi.org\/10.1109\/CVPR.2018.00745","DOI":"10.1109\/CVPR.2018.00745"},{"key":"1445_CR5","doi-asserted-by":"publisher","unstructured":"Wang QL, Wu BG, Zhu PF, Li PH, Zuo WM; Hu QH (2020) ECA-Net: efficient channel attention for deep convolutional neural networks. 2020 IEEE\/CVF conference on computer vision and pattern recognition (CVPR) 11531-11539. https:\/\/doi.org\/10.1109\/CVPR42600.2020.01155","DOI":"10.1109\/CVPR42600.2020.01155"},{"key":"1445_CR6","doi-asserted-by":"publisher","unstructured":"Yang ZX, Zhu LC, Wu Y, Yang Y (2020) Gated channel transformation for visual recognition. 2020 IEEE\/CVF conference on computer vision and pattern recognition (CVPR) 11794-11803. 
https:\/\/doi.org\/10.1109\/CVPR42600.2020.01181","DOI":"10.1109\/CVPR42600.2020.01181"},{"key":"1445_CR7","doi-asserted-by":"publisher","unstructured":"Qin ZQ, Zhang PY, Wu F, Li X (2021) Fcanet: Frequency channel attention networks, 2021 IEEE\/CVF international conference on computer vision (ICCV) 763\u2013772, https:\/\/doi.org\/10.1109\/ICCV48922.2021.00082","DOI":"10.1109\/ICCV48922.2021.00082"},{"key":"1445_CR8","doi-asserted-by":"publisher","first-page":"2204","DOI":"10.48550\/arXiv.1406.6247","volume":"2","author":"M Volodymyr","year":"2014","unstructured":"Volodymyr M, Nicolas H, Alex G, Koray K (2014) Recurrent models of visual attention. Neural Inf Process Syst 2:2204\u20132212. https:\/\/doi.org\/10.48550\/arXiv.1406.6247","journal-title":"Neural Inf Process Syst"},{"key":"1445_CR9","doi-asserted-by":"publisher","unstructured":"Max J, Karen S, Andrew Z, Koray Kavukcuoglu (2015) Spatial Transformer Network. NIPS'15: Proceedings of the 28th International Conference on Neural Information Processing Systems 2:2017\u20132025. https:\/\/doi.org\/10.48550\/arXiv.1506.02025","DOI":"10.48550\/arXiv.1506.02025"},{"issue":"6","key":"1445_CR10","doi-asserted-by":"publisher","first-page":"6896","DOI":"10.1109\/TPAMI.2020.3007032","volume":"45","author":"ZL Huang","year":"2019","unstructured":"Huang ZL, Wang XG, Wei YC, Huang LC, Shi H, Liu WY, Thomas SH (2019) Ccnet Crisscross attention for semantic segmentation. IEEE Trans Pattern Anal Mach Intell 45(6):6896\u20136908. https:\/\/doi.org\/10.1109\/TPAMI.2020.3007032","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"1445_CR11","doi-asserted-by":"publisher","unstructured":"Park J and Sanghyun W, Lee JY, Kweon IS (2018) Bam: bottleneck attention module. ArXiv. 
https:\/\/doi.org\/10.48550\/arXiv.1807.06514","DOI":"10.48550\/arXiv.1807.06514"},{"key":"1445_CR12","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2022.108785","author":"GQ Li","year":"2022","unstructured":"Li GQ, Fang Q, Zha LL, Gao X, Zheng NG (2022) HAM: Hybrid attention module in deep convolutional neural networks for image classification. Pattern Recognit J: Pattern Recognit Soc. https:\/\/doi.org\/10.1016\/j.patcog.2022.108785","journal-title":"Pattern Recognit J: Pattern Recognit Soc"},{"key":"1445_CR13","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2021.114770","volume":"178","author":"YB Wang","year":"2021","unstructured":"Wang YB, Wang HF, Peng ZH (2021) Rice diseases detection and classification using attention based neural network and bayesian optimization. Expert Syst Appl 178:114770. https:\/\/doi.org\/10.1016\/j.eswa.2021.114770","journal-title":"Expert Syst Appl"},{"issue":"2","key":"1445_CR14","doi-asserted-by":"publisher","first-page":"540","DOI":"10.1109\/TMI.2018.2867261","volume":"38","author":"GR Abhijit","year":"2019","unstructured":"Abhijit GR, Nassir N, Christian W (2019) Recalibrating fully convolutional networks with spatial and channel \u201cSqueeze and Excitation\u201d blocks. IEEE Trans Med Imaging 38(2):540\u2013549. https:\/\/doi.org\/10.1109\/TMI.2018.2867261","journal-title":"IEEE Trans Med Imaging"},{"key":"1445_CR15","doi-asserted-by":"publisher","unstructured":"Zhang QL, Yang YB (2021) Sa-net: shuffle attention for deep convolutional neural networks. ICASSP 2021\u20132021 IEEE international conference on acoustics, speech and signal processing (ICASSP) 2235\u20132239. 
https:\/\/doi.org\/10.1109\/ICASSP39728.2021.9414568","DOI":"10.1109\/ICASSP39728.2021.9414568"},{"key":"1445_CR16","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2105.14447","author":"H Zhang","year":"2022","unstructured":"Zhang H, Zu KK, Lu J, Meng DY (2022) EPSANet: an efficient pyramid squeeze attention block on convolutional neural network. Comput Vis Pattern Recognit. https:\/\/doi.org\/10.48550\/arXiv.2105.14447","journal-title":"Comput Vis Pattern Recognit"},{"key":"1445_CR17","doi-asserted-by":"publisher","unstructured":"Hou QB, Zhou DQ, Feng JS (2021) Coordinate attention for efficient mobile network design. 2021 IEEE\/CVF conference on computer vision and pattern recognition (CVPR). 13708\u201313717. https:\/\/doi.org\/10.48550\/arXiv.2103.02907","DOI":"10.48550\/arXiv.2103.02907"},{"key":"1445_CR18","doi-asserted-by":"publisher","DOI":"10.5555\/2969830","author":"CY Le","year":"1990","unstructured":"Le CY, Boser B, Denker JS, Henderson D, Howard RE, Hubbard W, Jackel LD (1990) Handwritten digit recognition with a backpropogation network. Adv Neural Inf Process Syst. https:\/\/doi.org\/10.5555\/2969830","journal-title":"Adv Neural Inf Process Syst"},{"key":"1445_CR19","doi-asserted-by":"publisher","unstructured":"Alex K, Ilya S, Geoffrey EH (2012) Imagenet classification with deep convolutional neural networks. In: 2012 neural information processing systems (NIPS) 25:1097\u20131105. https:\/\/doi.org\/10.1145\/3065386","DOI":"10.1145\/3065386"},{"key":"1445_CR20","doi-asserted-by":"publisher","unstructured":"Karen S, Andrew Z (2015) Very deep convolutional networks for large_scale image recognition. 2015 international conference on learning representations (ICLR). https:\/\/doi.org\/10.48550\/arXiv.1409.1556","DOI":"10.48550\/arXiv.1409.1556"},{"key":"1445_CR21","doi-asserted-by":"publisher","unstructured":"Christian S, Sergey I, Vincent V, Alexander AA (2016). Inception-v4, inception-ResNet and the impact of residual connections on learning. 
AAAI'17: proceedings of the Thirty-First AAAI conference on artificial intelligence 4278\u20134284 https:\/\/doi.org\/10.48550\/arXiv.1602.07261","DOI":"10.48550\/arXiv.1602.07261"},{"key":"1445_CR22","doi-asserted-by":"publisher","unstructured":"He KM, Zhang XY, Ren SQ, Sun J (2016) Deep residual learning for image recognition. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR) 7. https:\/\/doi.org\/10.1109\/CVPR.2016.90","DOI":"10.1109\/CVPR.2016.90"},{"key":"1445_CR23","doi-asserted-by":"publisher","unstructured":"Andrew GH, Zhu ML, Chen B, Dmitry K, Wang WJ, Tobias W, Andreetto M, Hartwig A (2017) Mobilenets: efficient convolutional neural networks for mobile vision applications. ArXiv:1704.04861. https:\/\/doi.org\/10.48550\/arXiv.1704.04861","DOI":"10.48550\/arXiv.1704.04861"},{"key":"1445_CR24","doi-asserted-by":"publisher","unstructured":"Mark S, Andrew H, Zhu ML, Andrey Zhmoginov, Chen LC (2018) Mobilenetv2: inverted residuals and linear bottlenecks. The IEEE conference on computer vision and pattern recognition (CVPR) 4510\u20134520. https:\/\/doi.org\/10.48550\/arXiv.1801.04381","DOI":"10.48550\/arXiv.1801.04381"},{"key":"1445_CR25","doi-asserted-by":"publisher","unstructured":"Andrew H, Mark S, Chu G, Chen LC, Chen B, Tan MX, Wang WJ, Zhu YK, Pang RM, Vijay V, Quoc V L, Hartwig A (2019) Searching for mobilenetv3. 2019 IEEE\/CVF International Conference on Computer Vision (ICCV). https:\/\/doi.org\/10.48550\/arXiv.1905.02244","DOI":"10.48550\/arXiv.1905.02244"},{"key":"1445_CR26","doi-asserted-by":"publisher","first-page":"016518","DOI":"10.1109\/LGRS.2021.3052557","volume":"17","author":"HZ Jin","year":"2023","unstructured":"Jin HZ, Bao ZX, Chang XL, Zhang TT, Chen C (2023) Semantic segmentation of remote sensing images based on dilated convolution and spatial-channel attention mechanism. J Appl Remote Sens 17:016518\u2013016518. 
https:\/\/doi.org\/10.1109\/LGRS.2021.3052557","journal-title":"J Appl Remote Sens"},{"key":"1445_CR27","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2022.118625","author":"NY Shen","year":"2023","unstructured":"Shen NY, Wang ZY, Li J, Gao HY, Lu W, Hu P, Feng LY (2023) Multi-organ segmentation network for abdominal CT images based on spatial attention and deformable convolution. Expert Syst Appl. https:\/\/doi.org\/10.1016\/j.eswa.2022.118625","journal-title":"Expert Syst Appl"},{"issue":"11","key":"1445_CR28","doi-asserted-by":"publisher","first-page":"13432","DOI":"10.1007\/s10489-022-04170-3","volume":"53","author":"Y Yu","year":"2023","unstructured":"Yu Y, Zhang Y, Song Z, Tanget CK (2023) LMA: lightweight mixed-domain attention for efficient network design. Appl Intell 53(11):13432\u201313451. https:\/\/doi.org\/10.1007\/s10489-022-04170-3","journal-title":"Appl Intell"},{"key":"1445_CR29","doi-asserted-by":"publisher","DOI":"10.1016\/j.engappai.2023.106072","volume":"122","author":"Y Shen","year":"2023","unstructured":"Shen Y, Zheng W, Chen LQ, Huang F (2023) RSHAN: Image super-resolution network based on residual separation hybrid attention module. Eng Appl Artif Intell: Int J Intell Real-Time Autom 122:106072. https:\/\/doi.org\/10.1016\/j.engappai.2023.106072","journal-title":"Eng Appl Artif Intell: Int J Intell Real-Time Autom"},{"key":"1445_CR30","doi-asserted-by":"publisher","first-page":"15143","DOI":"10.1007\/s11042-022-13999-2","volume":"82","author":"MX Jin","year":"2023","unstructured":"Jin MX, Li HF, Xia ZQ (2023) Hybrid attention network and center-guided non-maximum suppression for occluded face detection. Multimed Tools Appl 82:15143\u201315170. 
https:\/\/doi.org\/10.1007\/s11042-022-13999-2","journal-title":"Multimed Tools Appl"},{"key":"1445_CR31","doi-asserted-by":"publisher","DOI":"10.1016\/j.engappai.2023.105845","author":"CK Shi","year":"2023","unstructured":"Shi CK, Hao YX, Li GY, Xu SY (2023) EBNAS: efficient binary network design for image classification via neural architecture search. Eng Appl Artif Intell: Int J Intell Real-Time Autom. https:\/\/doi.org\/10.1016\/j.engappai.2023.105845","journal-title":"Eng Appl Artif Intell: Int J Intell Real-Time Autom"},{"key":"1445_CR32","unstructured":"Alex K (2009) Learning multiple layers of features from tiny images. Handbook of systemic autoimmune diseases 1(4). https:\/\/www.cs.toronto.edu\/~kriz\/cifar.html"},{"key":"1445_CR33","first-page":"215","volume":"15","author":"C Adam","year":"2011","unstructured":"Adam C, Honglak L, Andrew Y (2011) An analysis of single-layer networks in unsupervised feature learning. Int Conf Artif Intell Stat 15:215\u2013223","journal-title":"Int Conf Artif Intell Stat"},{"issue":"2","key":"1445_CR34","doi-asserted-by":"publisher","first-page":"336","DOI":"10.1007\/s11263-019-01228-7","volume":"128","author":"RS Ramprasaath","year":"2017","unstructured":"Ramprasaath RS, Michael C, Abhishek D, Ramakrishna V, Devi P, Dhruv B (2017) Grad-cam: visual explanations from deep networks via gradient-based localization. IEEE Int Conf Comput Vis (ICCV). 128(2):336\u2013359. 
https:\/\/doi.org\/10.1007\/s11263-019-01228-7","journal-title":"IEEE Int Conf Comput Vis (ICCV)."}],"container-title":["Complex &amp; Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-024-01445-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s40747-024-01445-9\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-024-01445-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,17]],"date-time":"2024-07-17T17:24:34Z","timestamp":1721237074000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s40747-024-01445-9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,5,6]]},"references-count":34,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2024,8]]}},"alternative-id":["1445"],"URL":"https:\/\/doi.org\/10.1007\/s40747-024-01445-9","relation":{},"ISSN":["2199-4536","2198-6053"],"issn-type":[{"value":"2199-4536","type":"print"},{"value":"2198-6053","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,5,6]]},"assertion":[{"value":"29 November 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"31 March 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 May 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"On behalf of all authors, the corresponding author states that there is no conflict of 
interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}