{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,19]],"date-time":"2026-06-19T02:18:46Z","timestamp":1781835526429,"version":"3.54.5"},"reference-count":49,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2023,5,31]],"date-time":"2023-05-31T00:00:00Z","timestamp":1685491200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"National Key Research and Development Program of China","award":["2022ZD0160400"],"award-info":[{"award-number":["2022ZD0160400"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62206031, 62271346"],"award-info":[{"award-number":["62206031, 62271346"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100002858","name":"China Postdoctoral Science Foundation","doi-asserted-by":"crossref","award":["2021M700613, 2022M720581"],"award-info":[{"award-number":["2021M700613, 2022M720581"]}],"id":[{"id":"10.13039\/501100002858","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100019065","name":"Tianjin Science and Technology Program","doi-asserted-by":"crossref","award":["19ZXZNGX00050"],"award-info":[{"award-number":["19ZXZNGX00050"]}],"id":[{"id":"10.13039\/501100019065","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2023,11,30]]},"abstract":"<jats:p>\n            The way of constructing a robust feature pyramid is crucial for object detection. However, existing feature pyramid methods, which aggregate multi-level features by using element-wise sum or concatenation, are inefficient to construct a robust feature pyramid. The reason is that these methods cannot be effective in discriminating the relevant semantics of objects. In this article, we propose a Complementary Feature Pyramid Network (CFPN) to aggregate multi-level features selectively and efficiently by exploring complementary information between multi-level features. Specifically, a Spatial Complementary Module (SCM) and a Channel Complementary Module (CCM) are designed and embedded in CFPN to enhance useful information and suppress irrelevant information during feature fusions along spatial and channel dimensions, respectively. CFPN is a generic feature extractor, as evidenced by its seamless integration into single-stage, two-stage, and end-to-end object detectors. Experiments conducted on the COCO and Pascal VOC datasets demonstrate that integrating our CFPN into RetinaNet, Faster RCNN, Cascade RCNN, and Sparse RCNN obtains consistent performance improvements with negligible overheads. Code and models are available at:\n            <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"url\" xlink:href=\"https:\/\/github.com\/VIPLab-CQU\/CFPN\">https:\/\/github.com\/VIPLab-CQU\/CFPN<\/jats:ext-link>\n            .\n          <\/jats:p>","DOI":"10.1145\/3584362","type":"journal-article","created":{"date-parts":[[2023,2,15]],"date-time":"2023-02-15T23:15:29Z","timestamp":1676502929000},"page":"1-15","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":7,"title":["Complementary Feature Pyramid Network for Object Detection"],"prefix":"10.1145","volume":"19","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6978-8834","authenticated-orcid":false,"given":"Jin","family":"Xie","sequence":"first","affiliation":[{"name":"Tianjin University and Chongqing University"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6670-3727","authenticated-orcid":false,"given":"Yanwei","family":"Pang","sequence":"additional","affiliation":[{"name":"Tianjin University"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5178-2247","authenticated-orcid":false,"given":"Jing","family":"Pan","sequence":"additional","affiliation":[{"name":"Tianjin University of Technology and Education"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9872-9286","authenticated-orcid":false,"given":"Jing","family":"Nie","sequence":"additional","affiliation":[{"name":"Chongqing University"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5160-6841","authenticated-orcid":false,"given":"Jiale","family":"Cao","sequence":"additional","affiliation":[{"name":"Tianjin University"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4361-956X","authenticated-orcid":false,"given":"Jungong","family":"Han","sequence":"additional","affiliation":[{"name":"The University of Sheffield"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2023,5,31]]},"reference":[{"key":"e_1_3_1_2_2","first-page":"354","volume-title":"Proceedings of the 14th European Conference on Computer Vision.","author":"Cai Zhaowei","year":"2016","unstructured":"Zhaowei Cai, Quanfu Fan, Rogerio S. Feris, and Nuno Vasconcelos. 2016. A unified multi-scale deep convolutional neural network for fast object detection. In Proceedings of the 14th European Conference on Computer Vision.354\u2013370."},{"key":"e_1_3_1_3_2","first-page":"6154","volume-title":"Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.","author":"Cai Zhaowei","year":"2018","unstructured":"Zhaowei Cai and Nuno Vasconcelos. 2018. Cascade R-CNN: Delving into high quality object detection. In Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.6154\u20136162."},{"key":"e_1_3_1_4_2","first-page":"11482","volume-title":"Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.","author":"Cao Jiale","year":"2020","unstructured":"Jiale Cao, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan, Yanwei Pang, and Ling Shao. 2020. D2Det: Towards high quality object detection and instance segmentation. In Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.11482\u201311491."},{"key":"e_1_3_1_5_2","first-page":"213","volume-title":"Proceedings of the Eur. Conf. Comput. Vis.","author":"Carion Nicolas","year":"2020","unstructured":"Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, Sergey Zagoruyko, Andrea Vedaldi, Horst Bischof, Thomas Brox, and Jan-Michael Frahm. 2020. End-to-end object detection with transformers. In Proceedings of the Eur. Conf. Comput. Vis.213\u2013229."},{"key":"e_1_3_1_6_2","article-title":"MMDetection: Open MMLab detection toolbox and benchmark","author":"Chen Kai","year":"2019","unstructured":"Kai Chen, Jiaqi Wang, Jiangmiao Pang, Yuhang Cao, Yu Xiong, Xiaoxiao Li, Shuyang Sun, Wansen Feng, Ziwei Liu, Jiarui Xu, Zheng Zhang, Dazhi Cheng, Chenchen Zhu, Tianheng Cheng, Qijie Zhao, Buyu Li, Xin Lu, Rui Zhu, Yue Wu, Jifeng Dai, Jingdong Wang, Jianping Shi, Wanli Ouyang, Chen Change Loy, and Dahua Lin. 2019. MMDetection: Open MMLab detection toolbox and benchmark. arXiv:1906.07155 (2019). Retrieved from https:\/\/arxiv.org\/abs\/1906.07155.","journal-title":"arXiv:1906.07155"},{"key":"e_1_3_1_7_2","first-page":"379","volume-title":"Proceedings of the Adv. Neural Inform. Process. Syst.","author":"Dai Jifeng","year":"2016","unstructured":"Jifeng Dai, Yi Li, Kaiming He, and Jian Sun. 2016. R-fcn: Object detection via region-based fully convolutional networks. In Proceedings of the Adv. Neural Inform. Process. Syst.379\u2013387."},{"key":"e_1_3_1_8_2","first-page":"6569","volume-title":"Proceedings of the Int. Conf. Comput. Vis.","author":"Duan Kaiwen","year":"2019","unstructured":"Kaiwen Duan, Song Bai, Lingxi Xie, Honggang Qi, Qingming Huang, and Qi Tian. 2019. CenterNet: Keypoint triplets for object detection. In Proceedings of the Int. Conf. Comput. Vis.6569\u20136578."},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-009-0275-4"},{"key":"e_1_3_1_10_2","first-page":"7036","volume-title":"Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.","author":"Ghiasi Golnaz","year":"2019","unstructured":"Golnaz Ghiasi, Tsung-Yi Lin, and Quoc V. Le. 2019. NAS-FPN: Learning scalable feature pyramid architecture for object detection. In Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.7036\u20137045."},{"key":"e_1_3_1_11_2","first-page":"1440","volume-title":"Proceedings of the Int. Conf. Comput. Vis.","author":"Girshick Ross","year":"2015","unstructured":"Ross Girshick. 2015. Fast r-cnn. In Proceedings of the Int. Conf. Comput. Vis.1440\u20131448."},{"key":"e_1_3_1_12_2","first-page":"580","volume-title":"Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.","author":"Girshick Ross","year":"2014","unstructured":"Ross Girshick, Jeff Donahue, Trevor Darrell, and Jitendra Malik. 2014. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.580\u2013587."},{"key":"e_1_3_1_13_2","first-page":"12595","volume-title":"Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.","author":"Guo Chaoxu","year":"2020","unstructured":"Chaoxu Guo, Bin Fan, Qian Zhang, Shiming Xiang, and Chunhong Pan. 2020. AugFPN: Improving multi-scale feature learning for object detection. In Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.12595\u201312604."},{"key":"e_1_3_1_14_2","first-page":"1026","volume-title":"Proceedings of the Int. Conf. Comput. Vis.","author":"He Kaiming","year":"2015","unstructured":"Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2015. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In Proceedings of the Int. Conf. Comput. Vis.1026\u20131034."},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2015.2389824"},{"key":"e_1_3_1_16_2","first-page":"770","volume-title":"Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.","author":"He Kaiming","year":"2016","unstructured":"Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.770\u2013778."},{"key":"e_1_3_1_17_2","first-page":"239","volume-title":"Proceedings of the Eur. Conf. Comput. Vis.","author":"Kim Seung-Wook","year":"2018","unstructured":"Seung-Wook Kim, Hyong-Keun Kook, Jee-Young Sun, Mun-Cheon Kang, and Sung-Jea Ko. 2018. Parallel feature pyramid network for object detection. In Proceedings of the Eur. Conf. Comput. Vis.239\u2013256."},{"key":"e_1_3_1_18_2","first-page":"172","volume-title":"Proceedings of the Eur. Conf. Comput. Vis.","author":"Kong Tao","year":"2018","unstructured":"Tao Kong, Fuchun Sun, Chuanqi Tan, Huaping Liu, and Wenbing Huang. 2018. Deep feature pyramid reconfiguration for object detection. In Proceedings of the Eur. Conf. Comput. Vis.172\u2013188."},{"key":"e_1_3_1_19_2","first-page":"5244","volume-title":"Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.","author":"Kong Tao","year":"2017","unstructured":"Tao Kong, Fuchun Sun, Anbang Yao, Huaping Liu, Ming Lu, and Yurong Chen. 2017. Ron: Reverse connection with objectness prior networks for object detection. In Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.5244\u20135252."},{"key":"e_1_3_1_20_2","first-page":"765","volume-title":"Proceedings of the Eur. Conf. Comput. Vis.","author":"Law Hei","year":"2018","unstructured":"Hei Law and Jia Deng. 2018. CornerNet: Detecting objects as paired keypoints. In Proceedings of the Eur. Conf. Comput. Vis.765\u2013781."},{"key":"e_1_3_1_21_2","first-page":"8577","volume-title":"Proceedings of the AAAI","author":"Li Buyu","year":"2019","unstructured":"Buyu Li, Yu Liu, and Xiaogang Wang. 2019. Gradient harmonized single-stage detector. In Proceedings of the AAAI. 8577\u20138584."},{"key":"e_1_3_1_22_2","first-page":"11632","volume-title":"Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.","author":"Li Xiang","year":"2021","unstructured":"Xiang Li, Wenhai Wang, Xiaolin Hu, Jun Li, Jinhui Tang, and Jian Yang. 2021. Generalized focal loss V2: Learning reliable localization quality estimation for dense object detection. In Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.11632\u201311641."},{"key":"e_1_3_1_23_2","first-page":"936","volume-title":"Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.","author":"Lin Tsung-Yi","year":"2017","unstructured":"Tsung-Yi Lin, Piotr Doll\u00e1r, Ross B. Girshick, Kaiming He, Bharath Hariharan, and Serge J. Belongie. 2017. Feature pyramid networks for object detection. In Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.936\u2013944."},{"key":"e_1_3_1_24_2","first-page":"2999","volume-title":"Proceedings of the Int. Conf. Comput. Vis.","author":"Lin Tsung-Yi","year":"2017","unstructured":"Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollar. 2017. Focal loss for dense object detection. In Proceedings of the Int. Conf. Comput. Vis.2999\u20133007."},{"key":"e_1_3_1_25_2","first-page":"740","volume-title":"Proceedings of the Eur. Conf. Comput. Vis.","author":"Lin Tsung-Yi","year":"2014","unstructured":"Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Doll\u00e1r, and C. Lawrence Zitnick. 2014. Microsoft coco: Common objects in context. In Proceedings of the Eur. Conf. Comput. Vis.740\u2013755."},{"key":"e_1_3_1_26_2","first-page":"8759","volume-title":"Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.","author":"Liu Shu","year":"2018","unstructured":"Shu Liu, Lu Qi, Haifang Qin, Jianping Shi, and Jiaya Jia. 2018. Path aggregation network for instance segmentation. In Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.8759\u20138768."},{"key":"e_1_3_1_27_2","first-page":"21","volume-title":"Proceedings of the Eur. Conf. Comput. Vis.","author":"Liu Wei","year":"2016","unstructured":"Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, and Alexander C. Berg. 2016. SSD: Single shot multibox detector. In Proceedings of the Eur. Conf. Comput. Vis.21\u201337."},{"key":"e_1_3_1_28_2","first-page":"9992","volume-title":"Proceedings of the Int. Conf. Comput. Vis.","author":"Liu Ze","year":"2021","unstructured":"Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. 2021. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the Int. Conf. Comput. Vis.9992\u201310002."},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1145\/3381086"},{"key":"e_1_3_1_30_2","first-page":"821","volume-title":"Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.","author":"Pang Jiangmiao","year":"2019","unstructured":"Jiangmiao Pang, Kai Chen, Jianping Shi, Huajun Feng, Wanli Ouyang, and Dahua Lin. 2019. Libra R-CNN: Towards balanced learning for object detection. In Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.821\u2013830."},{"key":"e_1_3_1_31_2","first-page":"7336","volume-title":"Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.","author":"Pang Yanwei","year":"2019","unstructured":"Yanwei Pang, Tiancai Wang, Rao Muhammad Anwer, Fahad Shahbaz Khan, and Ling Shao. 2019. Efficient featurized image pyramid network for single shot detector. In Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.7336\u20137344."},{"key":"e_1_3_1_32_2","volume-title":"Proceedings of the NIPS-W","author":"Paszke Adam","year":"2017","unstructured":"Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. 2017. Automatic differentiation in PyTorch. In Proceedings of the NIPS-W."},{"key":"e_1_3_1_33_2","first-page":"6517","volume-title":"Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.","author":"Redmon Joseph","year":"2017","unstructured":"Joseph Redmon and Ali Farhadi. 2017. YOLO9000: Better, faster, stronger. In Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.6517\u20136525."},{"key":"e_1_3_1_34_2","article-title":"Yolov3: An incremental improvement","author":"Redmon Joseph","year":"2018","unstructured":"Joseph Redmon and Ali Farhadi. 2018. Yolov3: An incremental improvement. arXiv:1804.02767 (2018). Retrieved from https:\/\/arxiv.org\/abs\/1804.02767.","journal-title":"arXiv:1804.02767"},{"key":"e_1_3_1_35_2","first-page":"91","volume-title":"Proceedings of the Adv. Neural Inform. Process. Syst.","author":"Ren Shaoqing","year":"2015","unstructured":"Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. 2015. Faster r-cnn: Towards real-time object detection with region proposal networks. In Proceedings of the Adv. Neural Inform. Process. Syst.91\u201399."},{"key":"e_1_3_1_36_2","first-page":"14454","volume-title":"Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.","author":"Sun Peize","year":"2021","unstructured":"Peize Sun, Rufeng Zhang, Yi Jiang, Tao Kong, Chenfeng Xu, Wei Zhan, Masayoshi Tomizuka, Lei Li, Zehuan Yuan, Changhu Wang, and Ping Luo. 2021. Sparse R-CNN: End-to-end object detection with learnable proposals. In Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.14454\u201314463."},{"key":"e_1_3_1_37_2","first-page":"9627","volume-title":"Proceedings of the Int. Conf. Comput. Vis.","author":"Tian Zhi","year":"2019","unstructured":"Zhi Tian, Chunhua Shen, Hao Chen, and Tong He. 2019. FCOS: Fully convolutional one-stage object detection. In Proceedings of the Int. Conf. Comput. Vis.9627\u20139636."},{"key":"e_1_3_1_38_2","first-page":"15849","volume-title":"Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.","author":"Wang Jianfeng","year":"2021","unstructured":"Jianfeng Wang, Lin Song, Zeming Li, Hongbin Sun, Jian Sun, and Nanning Zheng. 2021. End-to-end object detection with fully convolutional network. In Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.15849\u201315858."},{"key":"e_1_3_1_39_2","first-page":"13359","volume-title":"Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.","author":"Wang Xinjiang","year":"2020","unstructured":"Xinjiang Wang, Shilong Zhang, Zhuoran Yu, Litong Feng, and Wei Zhang. 2020. Scale-equalizing Pyramid Convolution for object detection. In Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.13359\u201313368."},{"key":"e_1_3_1_40_2","doi-asserted-by":"publisher","DOI":"10.1145\/3462219"},{"key":"e_1_3_1_41_2","first-page":"3","volume-title":"Proceedings of the Eur. Conf. Comput. Vis.","author":"Woo Sanghyun","year":"2018","unstructured":"Sanghyun Woo, Jongchan Park, Joon-Young Lee, and In So Kweon. 2018. CBAM: Convolutional block attention module. In Proceedings of the Eur. Conf. Comput. Vis.3\u201319."},{"key":"e_1_3_1_42_2","first-page":"10186","volume-title":"Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.","author":"Wu Yue","year":"2020","unstructured":"Yue Wu, Yinpeng Chen, Lu Yuan, Zicheng Liu, Lijuan Wang, Hongzhi Li, and Yun Fu. 2020. Rethinking classification and localization for object detection. In Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.10186\u201310195."},{"key":"e_1_3_1_43_2","first-page":"5987","volume-title":"Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.","author":"Xie Saining","year":"2017","unstructured":"Saining Xie, Ross Girshick, Piotr Doll\u00e1r, Zhuowen Tu, and Kaiming He. 2017. Aggregated residual transformations for deep neural networks. In Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.5987\u20135995."},{"key":"e_1_3_1_44_2","first-page":"6649","volume-title":"Proceedings of the Int. Conf. Comput. Vis.","author":"Xu Hang","year":"2019","unstructured":"Hang Xu, Lewei Yao, Wei Zhang, Xiaodan Liang, and Zhenguo Li. 2019. Auto-FPN: Automatic network architecture adaptation for object detection beyond classification. In Proceedings of the Int. Conf. Comput. Vis.6649\u20136658."},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1145\/3472393"},{"key":"e_1_3_1_46_2","first-page":"9759","volume-title":"Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.","author":"Zhang Shifeng","year":"2020","unstructured":"Shifeng Zhang, Cheng Chi, Yongqiang Yao, Zhen Lei, and Stan Z. Li. 2020. Bridging the gap between anchor-based and anchor-free detection via adaptive training sample selection. In Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.9759\u20139768."},{"key":"e_1_3_1_47_2","first-page":"4203","volume-title":"Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.","author":"Zhang Shifeng","year":"2018","unstructured":"Shifeng Zhang, Longyin Wen, Xiao Bian, Zhen Lei, and Stan Z. Li. 2018. Single-shot refinement neural network for object detection. In Proceedings of the IEEE Conf. Comput. Vis. Pattern Recog.4203\u20134212."},{"key":"e_1_3_1_48_2","first-page":"2763","volume-title":"Proceedings of the Int. Conf. Comput. Vis.","author":"Zhao Gangming","year":"2021","unstructured":"Gangming Zhao, Weifeng Ge, and Yizhou Yu. 2021. GraphFPN: Graph feature pyramid network for object detection. In Proceedings of the Int. Conf. Comput. Vis.2763\u20132772."},{"key":"e_1_3_1_49_2","volume-title":"arXiv:1904.07850","author":"Zhou Xingyi","year":"2019","unstructured":"Xingyi Zhou, Dequan Wang, and Philipp Kr\u00e4henb\u00fchl. 2019. Objects as points. arXiv:1904.07850. Retrieved from https:\/\/arxiv.org\/abs\/1904.07850."},{"key":"e_1_3_1_50_2","article-title":"Deformable detr: Deformable transformer for end-to-end object detection","author":"Zhu Xizhou","year":"2021","unstructured":"Xizhou Zhu, Weijie Su, Lewei Lu, Bin Li, Xiaogang Wang, and Jifeng Dai. 2021. Deformable detr: Deformable transformer for end-to-end object detection. In Int. Conf. Learn. Represent.","journal-title":"In Int. Conf. Learn. Represent."}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3584362","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3584362","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T17:48:54Z","timestamp":1750182534000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3584362"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,5,31]]},"references-count":49,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2023,11,30]]}},"alternative-id":["10.1145\/3584362"],"URL":"https:\/\/doi.org\/10.1145\/3584362","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,5,31]]},"assertion":[{"value":"2022-06-12","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-01-28","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-05-31","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}