{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T06:52:27Z","timestamp":1777704747280,"version":"3.51.4"},"reference-count":26,"publisher":"SAGE Publications","issue":"6","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IFS"],"published-print":{"date-parts":[[2021,6,21]]},"abstract":"<jats:p>Feature pyramids are commonly applied to solve the scale variation problem for object detection. One of the most representative works of feature pyramid is Feature Pyramid Network (FPN), which is simple and efficient. However, the full power of multi-scale features might not be completely exploited in FPN due to its design defects. In this paper, we first analyze the structural problems of FPN that prevent multi-scale features from being fully exploited, then propose a new feature pyramid structure named Mixed Group FPN (MGFPN) to mitigate these design defects. Concretely, MGFPN strengthens feature utilization with two modules named Mixed Group Convolution (MGConv) and Contextual Attention (CA). MGConv reduces the spatial information loss of FPN in the feature generation stage, and CA narrows the semantic gaps between features of different receptive fields before lateral summation. By replacing FPN with MGFPN in FCOS, our method improves detector performance across many major backbones by 0.7 to 1.2 Average Precision (AP) on the MS-COCO benchmark without adding too many parameters, and it can easily be extended to other FPN-based models. 
The proposed MGFPN can serve as a simple and strong alternative for many other FPN based models.<\/jats:p>","DOI":"10.3233\/jifs-202372","type":"journal-article","created":{"date-parts":[[2021,3,16]],"date-time":"2021-03-16T13:40:04Z","timestamp":1615902004000},"page":"11171-11181","source":"Crossref","is-referenced-by-count":1,"title":["MGFPN: Enhancing multi-scale feature for object detection"],"prefix":"10.1177","volume":"40","author":[{"given":"Weiming","family":"He","sequence":"first","affiliation":[{"name":"School of Computer Science, South China Normal University, Guangzhou, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"You","family":"Wu","sequence":"additional","affiliation":[{"name":"School of Mathematics and Statistics, Hunan Normal University, Changsha, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jing","family":"Xiao","sequence":"additional","affiliation":[{"name":"School of Computer Science, South China Normal University, Guangzhou, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yang","family":"Cao","sequence":"additional","affiliation":[{"name":"School of Computer Science, South China Normal University, Guangzhou, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","reference":[{"issue":"4","key":"10.3233\/JIFS-202372_ref2","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","article-title":"Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs","volume":"40","author":"Chen","year":"2017","journal-title":"IEEE transactions on pattern analysis and machine intelligence"},{"key":"10.3233\/JIFS-202372_ref3","doi-asserted-by":"crossref","unstructured":"Chollet F. , Xception: Deep learning with depthwise separable convolutions. 
In Proceedings of the IEEE conference on computer vision and pattern recognition (2017), 1251\u20131258.","DOI":"10.1109\/CVPR.2017.195"},{"key":"10.3233\/JIFS-202372_ref4","doi-asserted-by":"crossref","unstructured":"Deng J. , Dong W. , Socher R. , Li L.-J. , Li K. and Fei-Fei L. , Imagenet: A large-scale hierarchical image database, In 2009 IEEE conference on computer vision and pattern recognition 248\u2013255. Ieee, (2009).","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"10.3233\/JIFS-202372_ref5","doi-asserted-by":"crossref","unstructured":"Duan K. , Bai S. , Xie L. , Qi H. , Huang Q. and Tian Q. , Centernet: Keypoint triplets for object detection, In Proceedings of the IEEE International Conference on Computer Vision (2019), 6569\u20136578.","DOI":"10.1109\/ICCV.2019.00667"},{"key":"10.3233\/JIFS-202372_ref7","doi-asserted-by":"crossref","unstructured":"Guo C. , Fan B. , Zhang Q. , Xiang S. and Pan C. , Augfpn: Improving multi-scale feature learning for object detection, In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (2020), 12595\u201312604.","DOI":"10.1109\/CVPR42600.2020.01261"},{"key":"10.3233\/JIFS-202372_ref8","doi-asserted-by":"crossref","unstructured":"He K. , Gkioxari G. , Doll\u00e1r P. and Girshick R. , Mask r-cnn. In Proceedings of the IEEE international conference on computer vision (2017), 2961\u20132969.","DOI":"10.1109\/ICCV.2017.322"},{"key":"10.3233\/JIFS-202372_ref9","doi-asserted-by":"crossref","unstructured":"He K. , Zhang X. , Ren S. and Sun J. , Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (2016), 770\u2013778.","DOI":"10.1109\/CVPR.2016.90"},{"key":"10.3233\/JIFS-202372_ref10","unstructured":"Hu J. , Shen L. , Albanie S. , Sun G. and Vedaldi A. 
, Gather-excite: Exploiting feature context in convolutional neural networks, In Advances in neural information processing systems (2018), 9401\u20139411."},{"key":"10.3233\/JIFS-202372_ref11","doi-asserted-by":"crossref","unstructured":"Hu J. , Shen L. and Sun G. , Squeeze-and-excitation networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (2018), 7132\u20137141.","DOI":"10.1109\/CVPR.2018.00745"},{"key":"10.3233\/JIFS-202372_ref15","doi-asserted-by":"crossref","unstructured":"Law H. and Deng J. , Cornernet: Detecting objects as paired keypoints. In Proceedings of the European Conference on Computer Vision (ECCV) (2018), 734\u2013750.","DOI":"10.1007\/978-3-030-01264-9_45"},{"key":"10.3233\/JIFS-202372_ref16","doi-asserted-by":"crossref","unstructured":"Li X. , Wang W. , Hu X. and Yang J. , Selective kernel networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (2019), 510\u2013519.","DOI":"10.1109\/CVPR.2019.00060"},{"key":"10.3233\/JIFS-202372_ref17","doi-asserted-by":"crossref","unstructured":"Lin T.-Y. , Goyal P. , Girshick R. , He K. and Doll\u00e1r P. , Focal loss for dense object detection, In Proceedings of the IEEE international conference on computer vision (2017), 2980\u20132988.","DOI":"10.1109\/ICCV.2017.324"},{"key":"10.3233\/JIFS-202372_ref18","doi-asserted-by":"crossref","unstructured":"Liu W. , Anguelov D. , Erhan D. , Szegedy C. , Reed S. , Fu C.-Y. and Berg A.C. , Ssd: Single shot multibox detector, In European conference on computer vision 21\u201337. Springer, (2016).","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"10.3233\/JIFS-202372_ref19","doi-asserted-by":"crossref","unstructured":"Redmon J. , Divvala S. , Girshick R. and Farhadi A. 
, You only look once: Unified, real-time object detection, In Proceedings of the IEEE conference on computer vision and pattern recognition (2016), 779\u2013788.","DOI":"10.1109\/CVPR.2016.91"},{"key":"10.3233\/JIFS-202372_ref20","doi-asserted-by":"crossref","unstructured":"Redmon J. and Farhadi A. , Yolo9000: better, faster, stronger, In Proceedings of the IEEE conference on computer vision and pattern recognition (2017), 7263\u20137271.","DOI":"10.1109\/CVPR.2017.690"},{"key":"10.3233\/JIFS-202372_ref23","doi-asserted-by":"crossref","unstructured":"Rezatofighi H. , Tsoi N. and Gwak J.Y. , Amir Sadeghian, Ian Reid, and Silvio Savarese, Generalized intersection over union: A metric and a loss for bounding box regression, In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2019), 658\u2013666.","DOI":"10.1109\/CVPR.2019.00075"},{"key":"10.3233\/JIFS-202372_ref25","doi-asserted-by":"crossref","unstructured":"Szegedy C. , Liu W. , Jia Y. , Sermanet P. , Reed S. , Anguelov D. , Erhan D. , Vanhoucke V. and Rabinovich A. , Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition (2015), 1\u20139.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"10.3233\/JIFS-202372_ref27","doi-asserted-by":"crossref","unstructured":"Tan M. , Pang R. and Le Q.V. , Efficientdet: Scalable and efficient object detection. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (2020), 10781\u201310790.","DOI":"10.1109\/CVPR42600.2020.01079"},{"key":"10.3233\/JIFS-202372_ref28","doi-asserted-by":"crossref","unstructured":"Tian Z. , Shen C. , Chen H. and He T. , Fcos: Fully convolutional one-stage object detection. 
In Proceedings of the IEEE international conference on computer vision (2019), 9627\u20139636.","DOI":"10.1109\/ICCV.2019.00972"},{"issue":"2","key":"10.3233\/JIFS-202372_ref29","doi-asserted-by":"crossref","first-page":"251","DOI":"10.1007\/s11263-006-7538-0","article-title":"A critical view of context","volume":"69","author":"Wolf","year":"2006","journal-title":"International Journal of Computer Vision"},{"key":"10.3233\/JIFS-202372_ref30","doi-asserted-by":"crossref","unstructured":"Woo S. , Park J. , Lee J.-Y. and Kweon I.S. , Cbam: Convolutional block attention module, In Proceedings of the European conference on computer vision (ECCV) (2018), 3\u201319.","DOI":"10.1007\/978-3-030-01234-2_1"},{"key":"10.3233\/JIFS-202372_ref31","doi-asserted-by":"crossref","unstructured":"Xie S. , Girshick R. , Doll\u00e1r P. , Tu Z. and He K. , Aggregated residual transformations for deep neural networks, In Proceedings of the IEEE conference on computer vision and pattern recognition (2017), 1492\u20131500.","DOI":"10.1109\/CVPR.2017.634"},{"key":"10.3233\/JIFS-202372_ref32","doi-asserted-by":"crossref","unstructured":"Zhang S. , Chi C. , Yao Y. , Lei Z. and Li S.Z. , Bridging the gap between anchor-based and anchorfree detection via adaptive training sample selection, In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (2020), 9759\u20139768.","DOI":"10.1109\/CVPR42600.2020.00978"},{"key":"10.3233\/JIFS-202372_ref33","doi-asserted-by":"crossref","unstructured":"Zhang T. , Qi G.-J. , Xiao B. and Wang J. , Interleaved group convolutions, In Proceedings of the IEEE international conference on computer vision (2017), 4373\u20134382.","DOI":"10.1109\/ICCV.2017.469"},{"key":"10.3233\/JIFS-202372_ref34","doi-asserted-by":"crossref","unstructured":"Zhao H. , Shi J. , Qi X. , Wang X. and Jia J. , Pyramid scene parsing network. 
In Proceedings of the IEEE conference on computer vision and pattern recognition (2017), 2881\u20132890.","DOI":"10.1109\/CVPR.2017.660"},{"key":"10.3233\/JIFS-202372_ref35","doi-asserted-by":"crossref","unstructured":"Zhu C. , He Y. and Savvides M. , Feature selective anchor-free module for single-shot object detection, In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2019), 840\u2013849.","DOI":"10.1109\/CVPR.2019.00093"}],"container-title":["Journal of Intelligent & Fuzzy Systems"],"original-title":[],"link":[{"URL":"https:\/\/content.iospress.com\/download?id=10.3233\/JIFS-202372","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T09:41:55Z","timestamp":1777455715000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/full\/10.3233\/JIFS-202372"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,6,21]]},"references-count":26,"journal-issue":{"issue":"6"},"URL":"https:\/\/doi.org\/10.3233\/jifs-202372","relation":{},"ISSN":["1064-1246","1875-8967"],"issn-type":[{"value":"1064-1246","type":"print"},{"value":"1875-8967","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,6,21]]}}}