{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,25]],"date-time":"2026-07-25T16:15:28Z","timestamp":1784996128842,"version":"3.55.0"},"publisher-location":"New York, NY, USA","reference-count":39,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,8,24]],"date-time":"2021-08-24T00:00:00Z","timestamp":1629763200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,8,24]]},"DOI":"10.1145\/3460426.3463588","type":"proceedings-article","created":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T22:50:29Z","timestamp":1630536629000},"page":"481-485","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":36,"title":["NMS-Loss: Learning with Non-Maximum Suppression for Crowded Pedestrian Detection"],"prefix":"10.1145","author":[{"given":"Zekun","family":"Luo","sequence":"first","affiliation":[{"name":"Youtu Lab & Tencent, Shanghai, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Zheng","family":"Fang","sequence":"additional","affiliation":[{"name":"Beihang University, Beijing, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Sixiao","family":"Zheng","sequence":"additional","affiliation":[{"name":"Fudan University, Shanghai, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yabiao","family":"Wang","sequence":"additional","affiliation":[{"name":"Youtu Lab & Tencent, Shanghai, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yanwei","family":"Fu","sequence":"additional","affiliation":[{"name":"Fudan University, Shanghai, 
China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2021,9]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"crossref","unstructured":"Navaneeth Bodla Bharat Singh Rama Chellappa and Larry S Davis. 2017. Soft-NMS--Improving Object Detection With One Line of Code. In ICCV. 5561--5569. Navaneeth Bodla Bharat Singh Rama Chellappa and Larry S Davis. 2017. Soft-NMS--Improving Object Detection With One Line of Code. In ICCV. 5561--5569.","DOI":"10.1109\/ICCV.2017.593"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"crossref","unstructured":"Garrick Brazil and Xiaoming Liu. 2019. Pedestrian Detection with Autoregressive Network Phases. In CVPR. 7231--7240. Garrick Brazil and Xiaoming Liu. 2019. Pedestrian Detection with Autoregressive Network Phases. In CVPR. 7231--7240.","DOI":"10.1109\/CVPR.2019.00740"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"crossref","unstructured":"Garrick Brazil Xi Yin and Xiaoming Liu. 2017. Illuminating pedestrians via simultaneous detection & segmentation. In ICCV. 4950--4959. Garrick Brazil Xi Yin and Xiaoming Liu. 2017. Illuminating pedestrians via simultaneous detection & segmentation. In ICCV. 4950--4959.","DOI":"10.1109\/ICCV.2017.530"},{"key":"e_1_3_2_1_4_1","volume-title":"A unified multi-scale deep convolutional neural network for fast object detection","author":"Cai Zhaowei","unstructured":"Zhaowei Cai , Quanfu Fan , Rogerio S Feris , and Nuno Vasconcelos . 2016. A unified multi-scale deep convolutional neural network for fast object detection . In ECCV. Springer , 354--370. Zhaowei Cai, Quanfu Fan, Rogerio S Feris, and Nuno Vasconcelos. 2016. A unified multi-scale deep convolutional neural network for fast object detection. In ECCV. Springer, 354--370."},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"crossref","unstructured":"Cheng Chi Shifeng Zhang Junliang Xing Zhen Lei Stan Z Li and Xudong Zou. 2020 a. Relational learning for joint head and human detection. In AAAI. 
10647--10654. Cheng Chi Shifeng Zhang Junliang Xing Zhen Lei Stan Z Li and Xudong Zou. 2020 a. Relational learning for joint head and human detection. In AAAI. 10647--10654.","DOI":"10.1609\/aaai.v34i07.6691"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"crossref","unstructured":"Cheng Chi Shifeng Zhang Junliang Xing Zhen Lei Stan Z Li Xudong Zou etal 2020 b. PedHunter: Occlusion Robust Pedestrian Detector in Crowded Scenes.. In AAAI. 10639--10646. Cheng Chi Shifeng Zhang Junliang Xing Zhen Lei Stan Z Li Xudong Zou et al. 2020 b. PedHunter: Occlusion Robust Pedestrian Detector in Crowded Scenes.. In AAAI. 10639--10646.","DOI":"10.1609\/aaai.v34i07.6690"},{"key":"e_1_3_2_1_7_1","unstructured":"Xuangeng Chu Anlin Zheng Xiangyu Zhang and Jian Sun. 2020. Detection in Crowded Scenes: One Proposal Multiple Predictions. In CVPR. 12214--12223. Xuangeng Chu Anlin Zheng Xiangyu Zhang and Jian Sun. 2020. Detection in Crowded Scenes: One Proposal Multiple Predictions. In CVPR. 12214--12223."},{"key":"e_1_3_2_1_8_1","volume-title":"Support-vector networks. Machine learning","author":"Cortes Corinna","year":"1995","unstructured":"Corinna Cortes and Vladimir Vapnik . 1995. Support-vector networks. Machine learning , Vol. 20 , 3 ( 1995 ), 273--297. Corinna Cortes and Vladimir Vapnik. 1995. Support-vector networks. Machine learning, Vol. 20, 3 (1995), 273--297."},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"crossref","unstructured":"Navneet Dalal and Bill Triggs. 2005. Histograms of oriented gradients for human detection. In CVPR. 886--893. Navneet Dalal and Bill Triggs. 2005. Histograms of oriented gradients for human detection. In CVPR. 886--893.","DOI":"10.1109\/CVPR.2005.177"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2014.2300479"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"crossref","unstructured":"Piotr Doll\u00e1r Zhuowen Tu Pietro Perona and Serge Belongie. 2009. Integral channel features. In BMVC . 
Piotr Doll\u00e1r Zhuowen Tu Pietro Perona and Serge Belongie. 2009. Integral channel features. In BMVC .","DOI":"10.5244\/C.23.91"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"crossref","unstructured":"P Dollar C Wojek B Schiele and P Perona. 2009. Pedestrian detection: A benchmark. In CVPR. 304--311. P Dollar C Wojek B Schiele and P Perona. 2009. Pedestrian detection: A benchmark. In CVPR. 304--311.","DOI":"10.1109\/CVPR.2009.5206631"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2011.155"},{"key":"e_1_3_2_1_14_1","volume-title":"Fused DNN: A deep neural network fusion approach to fast and robust pedestrian detection","author":"Du Xianzhi","unstructured":"Xianzhi Du , Mostafa El-Khamy , Jungwon Lee , and Larry Davis . 2017. Fused DNN: A deep neural network fusion approach to fast and robust pedestrian detection . In WACV. IEEE , 953--961. Xianzhi Du, Mostafa El-Khamy, Jungwon Lee, and Larry Davis. 2017. Fused DNN: A deep neural network fusion approach to fast and robust pedestrian detection. In WACV. IEEE, 953--961."},{"key":"e_1_3_2_1_15_1","volume-title":"Cascade object detection with deformable part models","author":"Felzenszwalb Pedro F","unstructured":"Pedro F Felzenszwalb , Ross B Girshick , and David McAllester . 2010. Cascade object detection with deformable part models . In CVPR. IEEE , 2241--2248. Pedro F Felzenszwalb, Ross B Girshick, and David McAllester. 2010. Cascade object detection with deformable part models. In CVPR. IEEE, 2241--2248."},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2009.167"},{"key":"e_1_3_2_1_17_1","unstructured":"Kaiming He Xiangyu Zhang Shaoqing Ren and Jian Sun. 2016. Deep residual learning for image recognition. In CVPR. 770--778. Kaiming He Xiangyu Zhang Shaoqing Ren and Jian Sun. 2016. Deep residual learning for image recognition. In CVPR. 
770--778."},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"crossref","unstructured":"Xin Huang Zheng Ge Zequn Jie and Osamu Yoshie. 2020. NMS by Representative Region: Towards Crowded Pedestrian Detection by Proposal Pairing. In CVPR. 10750--10759. Xin Huang Zheng Ge Zequn Jie and Osamu Yoshie. 2020. NMS by Representative Region: Towards Crowded Pedestrian Detection by Proposal Pairing. In CVPR. 10750--10759.","DOI":"10.1109\/CVPR42600.2020.01076"},{"key":"e_1_3_2_1_19_1","unstructured":"Songtao Liu Di Huang and Yunhong Wang. 2019 a. Adaptive NMS: Refining Pedestrian Detection in a Crowd. In CVPR. 6459--6468. Songtao Liu Di Huang and Yunhong Wang. 2019 a. Adaptive NMS: Refining Pedestrian Detection in a Crowd. In CVPR. 6459--6468."},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"crossref","unstructured":"Wei Liu Shengcai Liao Weidong Hu Xuezhi Liang and Xiao Chen. 2018. Learning efficient single-stage pedestrian detectors by asymptotic localization fitting. In ECCV. 618--634. Wei Liu Shengcai Liao Weidong Hu Xuezhi Liang and Xiao Chen. 2018. Learning efficient single-stage pedestrian detectors by asymptotic localization fitting. In ECCV. 618--634.","DOI":"10.1007\/978-3-030-01264-9_38"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"crossref","unstructured":"Wei Liu Shengcai Liao Weiqiang Ren Weidong Hu and Yinan Yu. 2019 b. High-level Semantic Feature Detection: A New Perspective for Pedestrian Detection. In CVPR. 5187--5196. Wei Liu Shengcai Liao Weiqiang Ren Weidong Hu and Yinan Yu. 2019 b. High-level Semantic Feature Detection: A New Perspective for Pedestrian Detection. In CVPR. 5187--5196.","DOI":"10.1109\/CVPR.2019.00533"},{"key":"e_1_3_2_1_22_1","volume-title":"Where","author":"Luo Yan","unstructured":"Yan Luo , Chongyang Zhang , Muming Zhao , Hao Zhou , and Jun Sun . 2020. Where , What, Whether : Multi-Modal Learning Meets Pedestrian Detection. In CVPR. 14065--14073. Yan Luo, Chongyang Zhang, Muming Zhao, Hao Zhou, and Jun Sun. 2020. 
Where, What, Whether: Multi-Modal Learning Meets Pedestrian Detection. In CVPR. 14065--14073."},{"key":"e_1_3_2_1_23_1","unstructured":"Jiayuan Mao Tete Xiao Yuning Jiang and Zhimin Cao. 2017. What can help pedestrian detection?. In CVPR. 3127--3136. Jiayuan Mao Tete Xiao Yuning Jiang and Zhimin Cao. 2017. What can help pedestrian detection?. In CVPR. 3127--3136."},{"key":"e_1_3_2_1_24_1","unstructured":"Woonhyun Nam Piotr Doll\u00e1r and Joon Hee Han. 2014. Local decorrelation for improved pedestrian detection. In NIPS. 424--432. Woonhyun Nam Piotr Doll\u00e1r and Joon Hee Han. 2014. Local decorrelation for improved pedestrian detection. In NIPS. 424--432."},{"key":"e_1_3_2_1_25_1","volume-title":"Rao Muhammad Anwer, Fahad Shahbaz Khan, and Ling Shao.","author":"Pang Yanwei","year":"2019","unstructured":"Yanwei Pang , Jin Xie , Muhammad Haris Khan , Rao Muhammad Anwer, Fahad Shahbaz Khan, and Ling Shao. 2019 . Mask-Guided Attention Network for Occluded Pedestrian Detection. In ICCV. 4967--4975. Yanwei Pang, Jin Xie, Muhammad Haris Khan, Rao Muhammad Anwer, Fahad Shahbaz Khan, and Ling Shao. 2019. Mask-Guided Attention Network for Occluded Pedestrian Detection. In ICCV. 4967--4975."},{"key":"e_1_3_2_1_26_1","unstructured":"Adam Paszke Sam Gross Soumith Chintala Gregory Chanan Edward Yang Zachary DeVito Zeming Lin Alban Desmaison Luca Antiga and Adam Lerer. 2017. Automatic differentiation in PyTorch. (2017). Adam Paszke Sam Gross Soumith Chintala Gregory Chanan Edward Yang Zachary DeVito Zeming Lin Alban Desmaison Luca Antiga and Adam Lerer. 2017. Automatic differentiation in PyTorch. (2017)."},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"crossref","unstructured":"Jimmy Ren Xiaohao Chen Jianbo Liu Wenxiu Sun Jiahao Pang Qiong Yan Yu-Wing Tai and Li Xu. 2017. Accurate single stage detector using recurrent rolling convolution. In CVPR. 5420--5428. Jimmy Ren Xiaohao Chen Jianbo Liu Wenxiu Sun Jiahao Pang Qiong Yan Yu-Wing Tai and Li Xu. 2017. 
Accurate single stage detector using recurrent rolling convolution. In CVPR. 5420--5428.","DOI":"10.1109\/CVPR.2017.87"},{"key":"e_1_3_2_1_28_1","unstructured":"Shaoqing Ren Kaiming He Ross Girshick and Jian Sun. 2015. Faster R-CNN: Towards real-time object detection with region proposal networks. In NIPS. 91--99. Shaoqing Ren Kaiming He Ross Girshick and Jian Sun. 2015. Faster R-CNN: Towards real-time object detection with region proposal networks. In NIPS. 91--99."},{"key":"e_1_3_2_1_29_1","volume-title":"Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556","author":"Simonyan Karen","year":"2014","unstructured":"Karen Simonyan and Andrew Zisserman . 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 ( 2014 ). Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)."},{"key":"e_1_3_2_1_30_1","unstructured":"Paul Viola and Michael Jones. 2001. Rapid object detection using a boosted cascade of simple features. In CVPR. I--I. Paul Viola and Michael Jones. 2001. Rapid object detection using a boosted cascade of simple features. In CVPR. I--I."},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"crossref","unstructured":"Xinlong Wang Tete Xiao Yuning Jiang Shuai Shao Jian Sun and Chunhua Shen. 2018. Repulsion loss: Detecting pedestrians in a crowd. In CVPR. 7774--7783. Xinlong Wang Tete Xiao Yuning Jiang Shuai Shao Jian Sun and Chunhua Shen. 2018. Repulsion loss: Detecting pedestrians in a crowd. In CVPR. 7774--7783.","DOI":"10.1109\/CVPR.2018.00811"},{"key":"e_1_3_2_1_32_1","unstructured":"Jialian Wu Chunluan Zhou Ming Yang Qian Zhang Yuan Li and Junsong Yuan. 2020. Temporal-Context Enhanced Detection of Heavily Occluded Pedestrians. In CVPR. 13430--13439. Jialian Wu Chunluan Zhou Ming Yang Qian Zhang Yuan Li and Junsong Yuan. 2020. 
Temporal-Context Enhanced Detection of Heavily Occluded Pedestrians. In CVPR. 13430--13439."},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2013.2296528"},{"key":"e_1_3_2_1_34_1","unstructured":"Junjie Yan Zhen Lei Longyin Wen and Stan Z Li. 2014. The fastest deformable part model for object detection. In CVPR. 2497--2504. Junjie Yan Zhen Lei Longyin Wen and Stan Z Li. 2014. The fastest deformable part model for object detection. In CVPR. 2497--2504."},{"key":"e_1_3_2_1_35_1","volume-title":"Is Faster R-CNN doing well for pedestrian detection?","author":"Zhang Liliang","unstructured":"Liliang Zhang , Liang Lin , Xiaodan Liang , and Kaiming He. 2016. Is Faster R-CNN doing well for pedestrian detection? . In ECCV. Springer , 443--457. Liliang Zhang, Liang Lin, Xiaodan Liang, and Kaiming He. 2016. Is Faster R-CNN doing well for pedestrian detection?. In ECCV. Springer, 443--457."},{"key":"e_1_3_2_1_36_1","volume-title":"Citypersons: A diverse dataset for pedestrian detection. In CVPR. 3213--3221.","author":"Zhang Shanshan","year":"2017","unstructured":"Shanshan Zhang , Rodrigo Benenson , and Bernt Schiele . 2017 . Citypersons: A diverse dataset for pedestrian detection. In CVPR. 3213--3221. Shanshan Zhang, Rodrigo Benenson, and Bernt Schiele. 2017. Citypersons: A diverse dataset for pedestrian detection. In CVPR. 3213--3221."},{"key":"e_1_3_2_1_37_1","first-page":"4","article-title":"Filtered channel features for pedestrian detection","volume":"1","author":"Zhang Shanshan","year":"2015","unstructured":"Shanshan Zhang , Rodrigo Benenson , Bernt Schiele , 2015 . Filtered channel features for pedestrian detection .. In CVPR , Vol. 1. 4 . Shanshan Zhang, Rodrigo Benenson, Bernt Schiele, et al. 2015. Filtered channel features for pedestrian detection.. In CVPR, Vol. 1. 4.","journal-title":"CVPR"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"crossref","unstructured":"Shifeng Zhang Longyin Wen Xiao Bian Zhen Lei and Stan Z Li. 2018. 
Occlusion-aware R-CNN: detecting pedestrians in a crowd. In ECCV. 637--653. Shifeng Zhang Longyin Wen Xiao Bian Zhen Lei and Stan Z Li. 2018. Occlusion-aware R-CNN: detecting pedestrians in a crowd. In ECCV. 637--653.","DOI":"10.1007\/978-3-030-01219-9_39"},{"key":"e_1_3_2_1_39_1","volume-title":"SSA-CNN: Semantic Self-Attention CNN for Pedestrian Detection. arXiv preprint arXiv:1902.09080","author":"Zhou Chengju","year":"2019","unstructured":"Chengju Zhou , Meiqing Wu , and Siew-Kei Lam . 2019. SSA-CNN: Semantic Self-Attention CNN for Pedestrian Detection. arXiv preprint arXiv:1902.09080 ( 2019 ). Chengju Zhou, Meiqing Wu, and Siew-Kei Lam. 2019. SSA-CNN: Semantic Self-Attention CNN for Pedestrian Detection. arXiv preprint arXiv:1902.09080 (2019)."}],"event":{"name":"ICMR '21: International Conference on Multimedia Retrieval","location":"Taipei Taiwan","acronym":"ICMR '21","sponsor":["SIGMM ACM Special Interest Group on Multimedia"]},"container-title":["Proceedings of the 2021 International Conference on Multimedia 
Retrieval"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3460426.3463588","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3460426.3463588","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:49:22Z","timestamp":1750193362000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3460426.3463588"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,8,24]]},"references-count":39,"alternative-id":["10.1145\/3460426.3463588","10.1145\/3460426"],"URL":"https:\/\/doi.org\/10.1145\/3460426.3463588","relation":{},"subject":[],"published":{"date-parts":[[2021,8,24]]},"assertion":[{"value":"2021-09-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}