{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T07:26:10Z","timestamp":1740122770386,"version":"3.37.3"},"reference-count":37,"publisher":"Springer Science and Business Media LLC","issue":"5","license":[{"start":{"date-parts":[[2023,2,11]],"date-time":"2023-02-11T00:00:00Z","timestamp":1676073600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,2,11]],"date-time":"2023-02-11T00:00:00Z","timestamp":1676073600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"Dutch Efficient Deep Learning program"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Neural Process Lett"],"published-print":{"date-parts":[[2023,10]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Attention modules can be added to neural network architectures to improve performance. This work presents an extensive comparison between several efficient attention modules for image classification and object detection, in addition to proposing a novel Attention Bias module with lower computational overhead. All measured attention modules have been efficiently re-implemented, which allows an objective comparison and evaluation of the relationship between accuracy and inference time. Our measurements show that single-image inference time increases far more (5\u201350%) than the increase in FLOPs suggests (0.2\u20133%) for a limited gain in accuracy, making computation cost an important selection criterion. Despite this increase in inference time, adding an attention module can outperform a deeper baseline ResNet in both speed and accuracy. Finally, we investigate the potential of adding attention modules to pretrained networks and show that fine-tuning is possible and superior to training from scratch. The choice of the best attention module strongly depends on the specific ResNet architecture, input resolution, batch size and inference framework.<\/jats:p>","DOI":"10.1007\/s11063-023-11161-z","type":"journal-article","created":{"date-parts":[[2023,2,11]],"date-time":"2023-02-11T09:05:36Z","timestamp":1676106336000},"page":"6797-6813","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Performance-Efficiency Comparisons of Channel Attention Modules for ResNets"],"prefix":"10.1007","volume":"55","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0874-4720","authenticated-orcid":false,"given":"Sander R.","family":"Klomp","sequence":"first","affiliation":[]},{"given":"Rob G. J.","family":"Wijnhoven","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7639-7716","authenticated-orcid":false,"given":"Peter H. N.","family":"de With","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,2,11]]},"reference":[{"key":"11161_CR1","doi-asserted-by":"crossref","unstructured":"He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770\u2013778. Microsoft Research Asia","DOI":"10.1109\/CVPR.2016.90"},{"key":"11161_CR2","unstructured":"Ioffe S, Szegedy C (2015) Batch normalization: Accelerating deep network training by reducing internal covariate shift. In: 32nd International Conference on Machine Learning, ICML 2015, vol. 1, pp. 448\u2013456"},{"key":"11161_CR3","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00745","author":"J Hu","year":"2018","unstructured":"Hu J, Shen L, Sun G (2018) Squeeze-and-excitation networks. CVPR. https:\/\/doi.org\/10.1109\/CVPR.2018.00745","journal-title":"CVPR"},{"key":"11161_CR4","doi-asserted-by":"crossref","unstructured":"Huang Z, Liang S, Liang M, Yang H (2020) DIANet: dense-and-implicit attention network. In: AAAI, pp. 4206\u20134214. arXiv:1905.10671","DOI":"10.1609\/aaai.v34i04.5842"},{"key":"11161_CR5","unstructured":"Zhang H, Wu C, Zhang Z, Zhu Y, Zhang Z, Lin H, Sun Y, He T, Mueller J, Manmatha R, Li M, Smola A (2020) ResNeSt: Split-Attention Networks. arXiv preprint arXiv:2004.08955"},{"issue":"6","key":"11161_CR6","doi-asserted-by":"publisher","first-page":"2674","DOI":"10.1109\/TCYB.2019.2894261","volume":"50","author":"X Chen","year":"2020","unstructured":"Chen X, Yu J, Wu Z (2020) Temporally identity-aware SSD with attentional LSTM. IEEE Trans Cybern 50(6):2674\u20132686. https:\/\/doi.org\/10.1109\/TCYB.2019.2894261","journal-title":"IEEE Trans Cybern"},{"key":"11161_CR7","doi-asserted-by":"publisher","unstructured":"Xu Z, Zhuang JBQL, Zhou J, Peng S (2018) domain attention model for domain generalization in object detection. pattern recognition and computer vision. PRCV 2018 11259. https:\/\/doi.org\/10.1007\/978-3-030-03341-5","DOI":"10.1007\/978-3-030-03341-5"},{"key":"11161_CR8","doi-asserted-by":"publisher","unstructured":"Wang X, Cai Z, Gao D, Vasconcelos N (2019) Towards universal object detection by domain attention. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7281\u20137290. https:\/\/doi.org\/10.1109\/CVPR.2019.00746","DOI":"10.1109\/CVPR.2019.00746"},{"key":"11161_CR9","doi-asserted-by":"publisher","unstructured":"Wang Q, Teng Z, Xing J, Gao J, Hu W, Maybank S (2018) Learning Attentions: Residual Attentional Siamese Network for High Performance Online Visual Tracking. In: CVPR2018, pp. 4854\u20134863. https:\/\/doi.org\/10.1109\/CVPR.2018.00510","DOI":"10.1109\/CVPR.2018.00510"},{"key":"11161_CR10","doi-asserted-by":"crossref","unstructured":"Lee H, Kim H-E, Nam H (2019) SRM : A style-based recalibration module for convolutional neural networks. In: ICCV, pp. 1854\u20131862. arXiv:1903.10829","DOI":"10.1109\/ICCV.2019.00194"},{"key":"11161_CR11","doi-asserted-by":"publisher","unstructured":"Wang Q, Wu B, Zhu P, Li P, Zuo W, Hu Q (2020) ECA-Net: Efficient channel attention for deep convolutional neural networks. In: 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 11531\u201311539. https:\/\/doi.org\/10.1109\/cvpr42600.2020.01155","DOI":"10.1109\/cvpr42600.2020.01155"},{"key":"11161_CR12","unstructured":"Krizhevsky A, Sutskever I, Hinton GEGE, Sulskever I, Hinton GEGE (2012) ImageNet Classification with Deep Convolutional Neural Networks. In: Advances in Neural Information and Processing Systems (NIPS)"},{"key":"11161_CR13","doi-asserted-by":"publisher","unstructured":"Jia Deng, Wei Dong, Socher R, Li-Jia Li, Kai Li, Li Fei-Fei (2009) ImageNet: A large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 248\u2013255. https:\/\/doi.org\/10.1109\/CVPRW.2009.5206848","DOI":"10.1109\/CVPRW.2009.5206848"},{"key":"11161_CR14","doi-asserted-by":"publisher","unstructured":"Simonyan K, Zisserman A (2015) Very deep convolutional networks for large-scale image recognition. In: ICLR. https:\/\/doi.org\/10.1016\/j.infsof.2008.09.005","DOI":"10.1016\/j.infsof.2008.09.005"},{"key":"11161_CR15","doi-asserted-by":"publisher","unstructured":"Xie S, Girshick R, Doll\u00e1r P, Tu Z, He K (2017) Aggregated residual transformations for deep neural networks. In: Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, pp. 5987\u20135995. https:\/\/doi.org\/10.1109\/CVPR.2017.634","DOI":"10.1109\/CVPR.2017.634"},{"key":"11161_CR16","unstructured":"Geirhos R, Michaelis C, Wichmann FA, Rubisch P, Bethge M, Brendel W (2019) ImageNet-trained CNNs are biased towards texture. ICLR, increasing shape bias improves accuracy and robustness"},{"key":"11161_CR17","doi-asserted-by":"publisher","unstructured":"Huang X, Belongie S (2017) Arbitrary style transfer in real-time with adaptive instance normalization. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2\u20134. https:\/\/doi.org\/10.1109\/ICCV.2017.167. http:\/\/openaccess.thecvf.com\/content_ICCV_2017\/papers\/Huang_Arbitrary_Style_Transfer_ICCV_2017_paper.pdf","DOI":"10.1109\/ICCV.2017.167"},{"key":"11161_CR18","unstructured":"Ulyanov D, Vedaldi A, Lempitsky V (2017) Instance Normalization: The missing ingredient for fast stylization. arXiv:1607.08022"},{"key":"11161_CR19","doi-asserted-by":"crossref","unstructured":"Pan X, Luo P, Shi J, Tang X (2018) Two at Once : enhancing learning and generalization capacities via IBN-Net. In: CVPR","DOI":"10.1007\/978-3-030-01225-0_29"},{"key":"11161_CR20","unstructured":"Hu J, Shen L, Albanie S, Sun G, Vedaldi A (2018) Gather-excite: Exploiting feature context in convolutional neural networks. In: advances in neural information processing systems (NeurIPS), pp. 9401\u20139411"},{"key":"11161_CR21","doi-asserted-by":"crossref","unstructured":"Hu X, Zhang Z, Jiang Z, Chaudhuri S, Yang Z, Nevatia R (2020) SPAN: spatial pyramid attention network for image manipulation localization. In: ECCV2020, pp. 312\u2013328","DOI":"10.1007\/978-3-030-58589-1_19"},{"key":"11161_CR22","doi-asserted-by":"publisher","unstructured":"Jaderberg M, Simonyan K, Zisserman A (2015) spatial transformer networks. In: Advances in Neural Information Processing Systems (NeurIPS), pp. 2017\u20132025. https:\/\/doi.org\/10.1145\/2948076.2948084","DOI":"10.1145\/2948076.2948084"},{"key":"11161_CR23","doi-asserted-by":"crossref","unstructured":"Wang X, Girshick R, Gupta A, He K (2018) Non-local neural networks. In: proceedings of the IEEE conference on computer vision and pattern recognition (CVPR)","DOI":"10.1109\/CVPR.2018.00813"},{"key":"11161_CR24","doi-asserted-by":"crossref","unstructured":"Woo S, Park J, Lee J-y, Kweon IS (2018) CBAM: convolutional block attention module. In: European conference on computer vision (ECCV)","DOI":"10.1007\/978-3-030-01234-2_1"},{"key":"11161_CR25","doi-asserted-by":"publisher","unstructured":"Bello I, Zoph B, Le Q, Vaswani A, Shlens J (2019) Attention augmented convolutional networks. In: proceedings of the IEEE international conference on computer vision (CVPR), pp. 3285\u20133294. https:\/\/doi.org\/10.1109\/ICCV.2019.00338","DOI":"10.1109\/ICCV.2019.00338"},{"key":"11161_CR26","doi-asserted-by":"publisher","unstructured":"Zhang S, Yang J, Schiele B (2018) Occluded pedestrian detection through guided attention in CNNs. In: CVPR, pp. 6995\u20137003. https:\/\/doi.org\/10.1109\/ICCChina.2012.6356930","DOI":"10.1109\/ICCChina.2012.6356930"},{"key":"11161_CR27","doi-asserted-by":"publisher","unstructured":"Cao Y, Xu J, Lin S, Wei F, Hu H (2019) GCNet: Non-local networks meet squeeze-excitation networks and beyond. In: Proceedings - 2019 international conference on computer vision workshop, ICCVW, pp. 1971\u20131980. https:\/\/doi.org\/10.1109\/ICCVW.2019.00246","DOI":"10.1109\/ICCVW.2019.00246"},{"key":"11161_CR28","doi-asserted-by":"publisher","unstructured":"Ma X, Guo J, Chen Q, Tang S, Yang Q, Fu S (2020) Attention meets normalization and beyond. In: IEEE international conference on multimedia and expo (ICME). https:\/\/doi.org\/10.1109\/ICME46284.2020.9102909","DOI":"10.1109\/ICME46284.2020.9102909"},{"key":"11161_CR29","doi-asserted-by":"publisher","unstructured":"Yu F, Chen H, Wang X, Xian W, Chen Y, Liu F, Madhavan V, Darrell T (2020) BDD100K: A diverse driving dataset for heterogeneous multitask learning. In: CVPR 2020, pp. 2633\u20132642. https:\/\/doi.org\/10.1109\/cvpr42600.2020.00271","DOI":"10.1109\/cvpr42600.2020.00271"},{"key":"11161_CR30","doi-asserted-by":"crossref","unstructured":"Microsoft COCO (2014) Lin, T.-Y.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., Zitnick, C.L. Common objects in context. In: ECCV 8693:740\u2013755","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"11161_CR31","doi-asserted-by":"publisher","unstructured":"Ren S, He K, Girshick R, Sun J (2015) Faster R-CNN: Towards real-time object detection with region proposal networks. In: NeurIPS, pp. 91\u201399. https:\/\/doi.org\/10.1109\/TPAMI.2016.2577031","DOI":"10.1109\/TPAMI.2016.2577031"},{"key":"11161_CR32","doi-asserted-by":"publisher","unstructured":"Lin T-Y, Doll\u00e1r P, Girshick R, He K, Hariharan B, Belongie S (2017) Feature pyramid networks for object detection. In: CVPR. https:\/\/doi.org\/10.1109\/CVPR.2017.106","DOI":"10.1109\/CVPR.2017.106"},{"key":"11161_CR33","unstructured":"Chen K, Wang J, Pang J, Cao Y, Xiong Y, Li X, Sun S, Feng W, Liu Z, Xu J, Zhang Z, Cheng D, Zhu C, Cheng T, Zhao Q, Li B, Lu X, Zhu R, Wu Y, Dai J, Wang J, Shi J, Ouyang W, Loy CC, Lin D (2019) MMDetection: Open MMLab detection toolbox and benchmark. arXiv:1906.07155"},{"key":"11161_CR34","doi-asserted-by":"publisher","unstructured":"He K, Girshick R, Dollar P (2019) Rethinking imageNet pre-training. In: proceedings of the IEEE international conference on computer vision (CVPR), pp. 4917\u20134926. https:\/\/doi.org\/10.1109\/ICCV.2019.00502","DOI":"10.1109\/ICCV.2019.00502"},{"key":"11161_CR35","unstructured":"Nam H, Lee H, Park J, Yoon W, Yoo D (2019) Reducing domain gap via style-agnostic networks. In: ICCVW. arXiv:1910.11645"},{"key":"11161_CR36","doi-asserted-by":"crossref","unstructured":"Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu C-YY, Berg AC (2016) SSD: Single shot multibox detector. In: ECCV, vol. 9905 LNCS, pp. 21\u201337","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"11161_CR37","doi-asserted-by":"publisher","unstructured":"Zhu R, Zhang S, Wang X, Wen L, Shi H, Bo L, Mei T (2019) Scratchdet: Training single-shot object detectors from scratch. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2263\u20132272. https:\/\/doi.org\/10.1109\/CVPR.2019.00237","DOI":"10.1109\/CVPR.2019.00237"}],"container-title":["Neural Processing Letters"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11063-023-11161-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11063-023-11161-z\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11063-023-11161-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,29]],"date-time":"2023-09-29T16:23:59Z","timestamp":1696004639000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11063-023-11161-z"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,2,11]]},"references-count":37,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2023,10]]}},"alternative-id":["11161"],"URL":"https:\/\/doi.org\/10.1007\/s11063-023-11161-z","relation":{},"ISSN":["1370-4621","1573-773X"],"issn-type":[{"type":"print","value":"1370-4621"},{"type":"electronic","value":"1573-773X"}],"subject":[],"published":{"date-parts":[[2023,2,11]]},"assertion":[{"value":"26 January 2023","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"11 February 2023","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Outside of the funding discussed in the previous item, there are no additional competing interests for any of the authors that are relevant to the content of this article.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"Not applicable, because only publicly available data was used.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent to participate"}},{"value":"All authors read and approved the final manuscript.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}}]}}