{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,21]],"date-time":"2026-02-21T08:49:23Z","timestamp":1771663763120,"version":"3.50.1"},"reference-count":84,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2023,7,12]],"date-time":"2023-07-12T00:00:00Z","timestamp":1689120000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100012166","name":"National Key R&D Program of China","doi-asserted-by":"crossref","award":["2021ZD0111902"],"award-info":[{"award-number":["2021ZD0111902"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62072015, U21B2038, U19B2039, and 61902053"],"award-info":[{"award-number":["62072015, U21B2038, U19B2039, and 61902053"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Beijing Natural Science Foundation","award":["4222021"],"award-info":[{"award-number":["4222021"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2023,11,30]]},"abstract":"<jats:p>Weakly supervised crowd counting involves the regression of the number of individuals present in an image, using only the total number as the label. However, this task is plagued by two primary challenges: the large variation of head size and uneven distribution of crowd density. To address these issues, we propose a novel Hypergraph Association Crowd Counting (HACC) framework. Our approach consists of a new multi-scale dilated pyramid module that can efficiently handle the large variation of head size. 
Further, we propose a novel hypergraph association module to solve the problem of uneven distribution of crowd density by encoding higher-order associations among features, which opens a new direction to solve this problem. Experimental results on multiple datasets demonstrate that our HACC model achieves new state-of-the-art results.<\/jats:p>","DOI":"10.1145\/3594670","type":"journal-article","created":{"date-parts":[[2023,4,26]],"date-time":"2023-04-26T11:46:25Z","timestamp":1682509585000},"page":"1-20","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":15,"title":["Hypergraph Association Weakly Supervised Crowd Counting"],"prefix":"10.1145","volume":"19","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0608-1502","authenticated-orcid":false,"given":"Bo","family":"Li","sequence":"first","affiliation":[{"name":"Beijing University of Technology"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6650-6790","authenticated-orcid":false,"given":"Yong","family":"Zhang","sequence":"additional","affiliation":[{"name":"Beijing University of Technology"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3658-5779","authenticated-orcid":false,"given":"Chengyang","family":"Zhang","sequence":"additional","affiliation":[{"name":"Beijing University of Technology"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3774-5789","authenticated-orcid":false,"given":"Xinglin","family":"Piao","sequence":"additional","affiliation":[{"name":"Beijing University of Technology"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8125-4648","authenticated-orcid":false,"given":"Baocai","family":"Yin","sequence":"additional","affiliation":[{"name":"Beijing University of Technology"}]}],"member":"320","published-online":{"date-parts":[[2023,7,12]]},"reference":[{"key":"e_1_3_1_2_2","first-page":"872","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Abousamra Shahira","year":"2021","unstructured":"Shahira Abousamra, Minh 
Hoai, Dimitris Samaras, and Chao Chen. 2021. Localization in the crowd with topological constraints. In Proceedings of the AAAI Conference on Artificial Intelligence. 872\u2013881."},{"key":"e_1_3_1_3_2","volume-title":"Proceedings of Advances in Neural Information Processing Systems","volume":"29","author":"Atwood James","year":"2016","unstructured":"James Atwood and Don Towsley. 2016. Diffusion-convolutional neural networks. In Proceedings of Advances in Neural Information Processing Systems, Vol. 29."},{"key":"e_1_3_1_4_2","first-page":"5744","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops","author":"Sam Deepak Babu","year":"2017","unstructured":"Deepak Babu Sam, Shiv Surya, and R. Venkatesh Babu. 2017. Switching convolutional neural network for crowd counting. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops. 5744\u20135752."},{"issue":"6","key":"e_1_3_1_5_2","doi-asserted-by":"crossref","first-page":"1784","DOI":"10.1109\/TITS.2017.2741507","article-title":"Intelligent vehicle counting and classification sensor for real-time traffic surveillance","volume":"19","author":"Balid Walid","year":"2017","unstructured":"Walid Balid, Hasan Tafish, and Hazem H. Refai. 2017. Intelligent vehicle counting and classification sensor for real-time traffic surveillance. IEEE Transactions on Intelligent Transportation Systems 19, 6 (2017), 1784\u20131794.","journal-title":"IEEE Transactions on Intelligent Transportation Systems"},{"key":"e_1_3_1_6_2","first-page":"640","volume-title":"Proceedings of the ACM International Conference on Multimedia","author":"Boominathan Lokesh","year":"2016","unstructured":"Lokesh Boominathan, Srinivas S. S. Kruthiventi, and R. Venkatesh Babu. 2016. CrowdNet: A deep convolutional network for dense crowd counting. In Proceedings of the ACM International Conference on Multimedia. 
640\u2013644."},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01228-1_45"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01228-1_45"},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2017.2699184"},{"key":"e_1_3_1_10_2","article-title":"Rethinking atrous convolution for semantic image segmentation","author":"Chen Liang-Chieh","year":"2017","unstructured":"Liang-Chieh Chen, George Papandreou, Florian Schroff, and Hartwig Adam. 2017. Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587 (2017).","journal-title":"arXiv preprint arXiv:1706.05587"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2020.04.117"},{"key":"e_1_3_1_12_2","article-title":"Reinforcing local feature representation for weakly-supervised dense crowd counting","author":"Chen Xiaoshuang","year":"2022","unstructured":"Xiaoshuang Chen and Hongtao Lu. 2022. Reinforcing local feature representation for weakly-supervised dense crowd counting. arXiv preprint arXiv:2202.10681 (2022).","journal-title":"arXiv preprint arXiv:2202.10681"},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2021.3055631"},{"key":"e_1_3_1_14_2","volume-title":"Proceedings of Advances in Neural Information Processing Systems","volume":"29","author":"Defferrard Micha\u00ebl","year":"2016","unstructured":"Micha\u00ebl Defferrard, Xavier Bresson, and Pierre Vandergheynst. 2016. Convolutional neural networks on graphs with fast localized spectral filtering. In Proceedings of Advances in Neural Information Processing Systems, Vol. 29."},{"key":"e_1_3_1_15_2","article-title":"An image is worth 16x16 words: Transformers for image recognition at scale","author":"Dosovitskiy Alexey","year":"2020","unstructured":"Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, et\u00a0al. 2020. 
An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020).","journal-title":"arXiv preprint arXiv:2010.11929"},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1038\/s41592-018-0261-2"},{"key":"e_1_3_1_17_2","first-page":"4013","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Fan Qi","year":"2020","unstructured":"Qi Fan, Wei Zhuo, Chi-Keung Tang, and Yu-Wing Tai. 2020. Few-shot object detection with attention-RPN and multi-relation detector. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 4013\u20134022."},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.image.2019.115664"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33013558"},{"key":"e_1_3_1_20_2","article-title":"Congested crowd instance localization with dilated convolutional Swin transformer","author":"Gao Junyu","year":"2021","unstructured":"Junyu Gao, Maoguo Gong, and Xuelong Li. 2021. Congested crowd instance localization with dilated convolutional Swin transformer. arXiv preprint arXiv:2108.00584 (2021).","journal-title":"arXiv preprint arXiv:2108.00584"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2019.08.018"},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.1145\/3052930"},{"key":"e_1_3_1_23_2","first-page":"9543","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops","author":"Guo Dongyan","year":"2021","unstructured":"Dongyan Guo, Yanyan Shao, Ying Cui, Zhenhua Wang, Liyan Zhang, and Chunhua Shen. 2021. Graph attention tracking. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops. 
9543\u20139552."},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.media.2020.101892"},{"key":"e_1_3_1_25_2","first-page":"7132","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops","author":"Hu Jie","year":"2018","unstructured":"Jie Hu, Li Shen, and Gang Sun. 2018. Squeeze-and-excitation networks. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops. 7132\u20137141."},{"key":"e_1_3_1_26_2","first-page":"2547","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops","author":"Idrees Haroon","year":"2013","unstructured":"Haroon Idrees, Imran Saleemi, Cody Seibert, and Mubarak Shah. 2013. Multi-source multi-scale counting in extremely dense crowd images. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops. 2547\u20132554."},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01216-8_33"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2019\/366"},{"key":"e_1_3_1_29_2","article-title":"Adam: A method for stochastic optimization","author":"Kingma Diederik P.","year":"2014","unstructured":"Diederik P. Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).","journal-title":"arXiv preprint arXiv:1412.6980"},{"key":"e_1_3_1_30_2","article-title":"Semi-supervised classification with graph convolutional networks","author":"Kipf Thomas N.","year":"2016","unstructured":"Thomas N. Kipf and Max Welling. 2016. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016).","journal-title":"arXiv preprint arXiv:1609.02907"},{"key":"e_1_3_1_31_2","first-page":"547","volume-title":"Proceedings of the European Conference on Computer Vision","author":"Laradji Issam H.","year":"2018","unstructured":"Issam H. 
Laradji, Negar Rostamzadeh, Pedro O. Pinheiro, David Vazquez, and Mark Schmidt. 2018. Where are the blobs: Counting by localization with point supervision. In Proceedings of the European Conference on Computer Vision. 547\u2013562."},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2020.107616"},{"key":"e_1_3_1_33_2","volume-title":"Proceedings of Advances in Neural Information Processing Systems","volume":"23","author":"Lempitsky Victor","year":"2010","unstructured":"Victor Lempitsky and Andrew Zisserman. 2010. Learning to count objects in images. In Proceedings of Advances in Neural Information Processing Systems, Vol. 23."},{"key":"e_1_3_1_34_2","first-page":"1","article-title":"CCST: Crowd counting with swin transformer","author":"Li Bo","year":"2022","unstructured":"Bo Li, Yong Zhang, Haihui Xu, and Baocai Yin. 2022. CCST: Crowd counting with swin transformer. Visual Computer 2022 (2022), 1\u201312.","journal-title":"Visual Computer"},{"key":"e_1_3_1_35_2","volume-title":"Proceedings of the International Conference on Artificial Life and Robotics","author":"Li Wang","year":"2020","unstructured":"Wang Li, Huailin Zhao, Zhen Nie, and Yaoyao Li. 2020. Graph-based global reasoning network for crowd counting. In Proceedings of the International Conference on Artificial Life and Robotics."},{"key":"e_1_3_1_36_2","first-page":"6054","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Li Yanghao","year":"2019","unstructured":"Yanghao Li, Yuntao Chen, Naiyan Wang, and Zhaoxiang Zhang. 2019. Scale-aware trident networks for object detection. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 6054\u20136063."},{"key":"e_1_3_1_37_2","first-page":"1091","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops","author":"Li Yuhong","year":"2018","unstructured":"Yuhong Li, Xiaofan Zhang, and Deming Chen. 2018. 
CSRNet: Dilated convolutional neural networks for understanding the highly congested scenes. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops. 1091\u20131100."},{"issue":"6","key":"e_1_3_1_38_2","first-page":"1","article-title":"TransCrowd: Weakly-supervised crowd counting with transformers","volume":"65","author":"Liang Dingkang","year":"2022","unstructured":"Dingkang Liang, Xiwu Chen, Wei Xu, Yu Zhou, and Xiang Bai. 2022. TransCrowd: Weakly-supervised crowd counting with transformers. Science China Information Sciences 65, 6 (2022), 1\u201314.","journal-title":"Science China Information Sciences"},{"key":"e_1_3_1_39_2","first-page":"38","volume-title":"Proceedings of the European Conference on Computer Vision","author":"Liang Dingkang","year":"2022","unstructured":"Dingkang Liang, Wei Xu, and Xiang Bai. 2022. An end-to-end transformer model for crowd localization. In Proceedings of the European Conference on Computer Vision. 38\u201354."},{"key":"e_1_3_1_40_2","article-title":"Focal inverse distance transform maps for crowd localization","author":"Liang Dingkang","year":"2022","unstructured":"Dingkang Liang, Wei Xu, Yingying Zhu, and Yu Zhou. 2022. Focal inverse distance transform maps for crowd localization. IEEE Transactions on Multimedia. Early access, September 2, 2022.","journal-title":"IEEE Transactions on Multimedia."},{"key":"e_1_3_1_41_2","first-page":"19628","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Lin Hui","year":"2022","unstructured":"Hui Lin, Zhiheng Ma, Rongrong Ji, Yaowei Wang, and Xiaopeng Hong. 2022. Boosting crowd counting via multifaceted attention. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 
19628\u201319637."},{"key":"e_1_3_1_42_2","first-page":"2117","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops","author":"Lin Tsung-Yi","year":"2017","unstructured":"Tsung-Yi Lin, Piotr Doll\u00e1r, Ross Girshick, Kaiming He, Bharath Hariharan, and Serge Belongie. 2017. Feature pyramid networks for object detection. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops. 2117\u20132125."},{"issue":"10","key":"e_1_3_1_43_2","first-page":"3513","article-title":"Counting objects by blockwise classification","volume":"30","author":"Liu Liang","year":"2019","unstructured":"Liang Liu, Hao Lu, Haipeng Xiong, Ke Xian, Zhiguo Cao, and Chunhua Shen. 2019. Counting objects by blockwise classification. IEEE Transactions on Circuits and Systems for Video Technology 30, 10 (2019), 3513\u20133527.","journal-title":"IEEE Transactions on Circuits and Systems for Video Technology"},{"key":"e_1_3_1_44_2","first-page":"1774","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Liu Lingbo","year":"2019","unstructured":"Lingbo Liu, Zhilin Qiu, Guanbin Li, Shufan Liu, Wanli Ouyang, and Liang Lin. 2019. Crowd counting with deep structured scale integration network. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 1774\u20131783."},{"key":"e_1_3_1_45_2","first-page":"385","volume-title":"Proceedings of the European Conference on Computer Vision","year":"2018","unstructured":"Songtao Liu and Di Huang. 2018. Receptive field block net for accurate and fast object detection. In Proceedings of the European Conference on Computer Vision. 385\u2013400."},{"key":"e_1_3_1_46_2","first-page":"5099","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops","author":"Liu Weizhe","year":"2019","unstructured":"Weizhe Liu, Mathieu Salzmann, and Pascal Fua. 2019. 
Context-aware crowd counting. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops. 5099\u20135108."},{"key":"e_1_3_1_47_2","article-title":"Global attention mechanism: Retain information to enhance channel-spatial interactions","author":"Liu Yichao","year":"2021","unstructured":"Yichao Liu, Zongru Shao, and Nico Hoffmann. 2021. Global attention mechanism: Retain information to enhance channel-spatial interactions. arXiv preprint arXiv:2112.05561 (2021).","journal-title":"arXiv preprint arXiv:2112.05561"},{"key":"e_1_3_1_48_2","first-page":"6469","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Liu Yuting","year":"2019","unstructured":"Yuting Liu, Miaojing Shi, Qijun Zhao, and Xiaofang Wang. 2019. Point in, box out: Beyond counting persons in crowds. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 6469\u20136478."},{"key":"e_1_3_1_49_2","first-page":"10012","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Liu Ze","year":"2021","unstructured":"Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. 2021. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 10012\u201310022."},{"key":"e_1_3_1_50_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i07.6837"},{"key":"e_1_3_1_51_2","first-page":"11693","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Luo Ao","year":"2020","unstructured":"Ao Luo, Fan Yang, Xin Li, Dong Nie, Zhicheng Jiao, Shangchen Zhou, and Hong Cheng. 2020. Hybrid graph neural networks for crowd counting. In Proceedings of the AAAI Conference on Artificial Intelligence. 
11693\u201311700."},{"key":"e_1_3_1_52_2","first-page":"1","article-title":"Hyperspectral image classification using feature fusion hypergraph convolution neural network","volume":"60","author":"Ma Zhongtian","year":"2021","unstructured":"Zhongtian Ma, Zhiguo Jiang, and Haopeng Zhang. 2021. Hyperspectral image classification using feature fusion hypergraph convolution neural network. IEEE Transactions on Geoscience and Remote Sensing 60 (2021), 1\u201314.","journal-title":"IEEE Transactions on Geoscience and Remote Sensing"},{"key":"e_1_3_1_53_2","first-page":"6142","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Ma Zhiheng","year":"2019","unstructured":"Zhiheng Ma, Xing Wei, Xiaopeng Hong, and Yihong Gong. 2019. Bayesian loss for crowd count estimation with point supervision. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 6142\u20136151."},{"key":"e_1_3_1_54_2","first-page":"615","volume-title":"Proceedings of the European Conference on Computer Vision","author":"Onoro-Rubio Daniel","year":"2016","unstructured":"Daniel Onoro-Rubio and Roberto J. L\u00f3pez-Sastre. 2016. Towards perspective-free object counting with deep learning. In Proceedings of the European Conference on Computer Vision. 615\u2013629."},{"key":"e_1_3_1_55_2","first-page":"3618","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops","author":"Sam Deepak Babu","year":"2018","unstructured":"Deepak Babu Sam, Neeraj N. Sajjan, R. Venkatesh Babu, and Mukundhan Srinivasan. 2018. Divide and grow: Capturing huge diversity in crowd images with incrementally growing CNN. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops. 
3618\u20133626."},{"key":"e_1_3_1_56_2","article-title":"CrowdFormer: Weakly-supervised crowd counting with improved generalizability","author":"Savner Siddharth Singh","year":"2022","unstructured":"Siddharth Singh Savner and Vivek Kanhangad. 2022. CrowdFormer: Weakly-supervised crowd counting with improved generalizability. arXiv preprint arXiv:2203.03768 (2022).","journal-title":"arXiv preprint arXiv:2203.03768"},{"key":"e_1_3_1_57_2","first-page":"19618","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Shu Weibo","year":"2022","unstructured":"Weibo Shu, Jia Wan, Kay Chen Tan, Sam Kwong, and Antoni B. Chan. 2022. Crowd counting in the frequency domain. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 19618\u201319627."},{"key":"e_1_3_1_58_2","article-title":"JHU-CROWD++: Large-scale crowd counting dataset and a benchmark method","author":"Sindagi Vishwanath","year":"2022","unstructured":"Vishwanath Sindagi, Rajeev Yasarla, and Vishal M. Patel. 2022. JHU-CROWD++: Large-scale crowd counting dataset and a benchmark method. IEEE Transactions on Pattern Analysis and Machine Intelligence 44, 6 (2022), 2594\u20132609.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"e_1_3_1_59_2","first-page":"1","volume-title":"Proceedings of the IEEE International Conference on Advanced Video and Signal Based Surveillance","author":"Sindagi Vishwanath A.","year":"2017","unstructured":"Vishwanath A. Sindagi and Vishal M. Patel. 2017. CNN-based cascaded multi-task learning of high-level prior and density estimation for crowd counting. In Proceedings of the IEEE International Conference on Advanced Video and Signal Based Surveillance. 1\u20136."},{"key":"e_1_3_1_60_2","first-page":"1861","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"Sindagi Vishwanath A.","year":"2017","unstructured":"Vishwanath A. 
Sindagi and Vishal M. Patel. 2017. Generating high-quality crowd density maps using contextual pyramid CNNs. In Proceedings of the IEEE International Conference on Computer Vision. 1861\u20131870."},{"key":"e_1_3_1_61_2","first-page":"1002","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Sindagi Vishwanath A.","year":"2019","unstructured":"Vishwanath A. Sindagi and Vishal M. Patel. 2019. Multi-level bottom-top and top-bottom feature fusion for crowd counting. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 1002\u20131012."},{"key":"e_1_3_1_62_2","first-page":"1221","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Sindagi Vishwanath A.","year":"2019","unstructured":"Vishwanath A. Sindagi, Rajeev Yasarla, and Vishal M. Patel. 2019. Pushing the frontiers of unconstrained crowd counting: New dataset and benchmark method. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 1221\u20131231."},{"key":"e_1_3_1_63_2","first-page":"2576","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Song Qingyu","year":"2021","unstructured":"Qingyu Song, Changan Wang, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Jian Wu, and Jiayi Ma. 2021. To choose or to fuse? Scale selection for crowd counting. In Proceedings of the AAAI Conference on Artificial Intelligence. 2576\u20132583."},{"key":"e_1_3_1_64_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.308"},{"key":"e_1_3_1_65_2","article-title":"Multi-level attentive convolutional neural network for crowd counting","author":"Tian Mengxiao","year":"2021","unstructured":"Mengxiao Tian, Hao Guo, and Chengjiang Long. 2021. Multi-level attentive convolutional neural network for crowd counting. 
arXiv preprint arXiv:2105.11422 (2021).","journal-title":"arXiv preprint arXiv:2105.11422"},{"key":"e_1_3_1_66_2","article-title":"CCTrans: Simplifying and improving crowd counting with transformer","author":"Tian Ye","year":"2021","unstructured":"Ye Tian, Xiangxiang Chu, and Hongpeng Wang. 2021. CCTrans: Simplifying and improving crowd counting with transformer. arXiv preprint arXiv:2109.14483 (2021).","journal-title":"arXiv preprint arXiv:2109.14483"},{"key":"e_1_3_1_67_2","first-page":"1974","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops","author":"Wan Jia","year":"2021","unstructured":"Jia Wan, Ziquan Liu, and Antoni B. Chan. 2021. A generalized loss function for crowd counting and localization. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops. 1974\u20131983."},{"key":"e_1_3_1_68_2","first-page":"1299","volume-title":"Proceedings of the ACM International Conference on Multimedia","author":"Wang Chuan","year":"2015","unstructured":"Chuan Wang, Hua Zhang, Liang Yang, Si Liu, and Xiaochun Cao. 2015. Deep people counting in extremely dense crowds. In Proceedings of the ACM International Conference on Multimedia. 1299\u20131302."},{"key":"e_1_3_1_69_2","article-title":"Joint CNN and transformer network via weakly supervised learning for efficient crowd counting","author":"Wang Fusen","year":"2022","unstructured":"Fusen Wang, Kai Liu, Fei Long, Nong Sang, Xiaofeng Xia, and Jun Sang. 2022. Joint CNN and transformer network via weakly supervised learning for efficient crowd counting. arXiv preprint arXiv:2203.06388 (2022).","journal-title":"arXiv preprint arXiv:2203.06388"},{"key":"e_1_3_1_70_2","first-page":"167","volume-title":"Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision","author":"Wang Mingjie","year":"2023","unstructured":"Mingjie Wang, Hao Cai, Yong Dai, and Minglun Gong. 2023. 
Dynamic mixture of counter network for location-agnostic crowd counting. In Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision. 167\u2013177."},{"key":"e_1_3_1_71_2","article-title":"CrowdMLP: Weakly-supervised crowd counting via multi-granularity MLP","author":"Wang Mingjie","year":"2022","unstructured":"Mingjie Wang, Jun Zhou, Hao Cai, and Minglun Gong. 2022. CrowdMLP: Weakly-supervised crowd counting via multi-granularity MLP. arXiv preprint arXiv:2203.08219 (2022).","journal-title":"arXiv preprint arXiv:2203.08219"},{"key":"e_1_3_1_72_2","first-page":"8198","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops","author":"Wang Qi","year":"2019","unstructured":"Qi Wang, Junyu Gao, Wei Lin, and Yuan Yuan. 2019. Learning from synthetic data for crowd counting in the wild. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops. 8198\u20138207."},{"key":"e_1_3_1_73_2","first-page":"399","volume-title":"Proceedings of the European Conference on Computer Vision","author":"Wang Xiaolong","year":"2018","unstructured":"Xiaolong Wang and Abhinav Gupta. 2018. Videos as space-time region graphs. In Proceedings of the European Conference on Computer Vision. 399\u2013417."},{"key":"e_1_3_1_74_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01234-2_1"},{"key":"e_1_3_1_75_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58577-8_20"},{"key":"e_1_3_1_76_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-021-01542-z"},{"key":"e_1_3_1_77_2","first-page":"3244","volume-title":"Proceedings of the International Conference on Pattern Recognition","author":"Yang Jianxing","year":"2018","unstructured":"Jianxing Yang, Yuan Zhou, and Sun-Yuan Kung. 2018. Multi-scale generative adversarial networks for crowd counting. In Proceedings of the International Conference on Pattern Recognition. 
3244\u20133249."},{"key":"e_1_3_1_78_2","first-page":"1","volume-title":"Proceedings of the European Conference on Computer Vision","author":"Yang Yifan","year":"2020","unstructured":"Yifan Yang, Guorong Li, Zhe Wu, Li Su, Qingming Huang, and Nicu Sebe. 2020. Weakly-supervised crowd counting learns from sorting rather than locations. In Proceedings of the European Conference on Computer Vision. 1\u201317."},{"key":"e_1_3_1_79_2","first-page":"465","volume-title":"Proceedings of the IEEE International Conference on Image Processing","author":"Zeng Lingke","year":"2017","unstructured":"Lingke Zeng, Xiangmin Xu, Bolun Cai, Suo Qiu, and Tong Zhang. 2017. Multi-scale convolutional neural networks for crowd counting. In Proceedings of the IEEE International Conference on Image Processing. 465\u2013469."},{"key":"e_1_3_1_80_2","article-title":"Co-communication graph convolutional network for multi-view crowd counting","author":"Zhai Qiang","year":"2022","unstructured":"Qiang Zhai, Fan Yang, Xin Li, Guo-Sen Xie, Hong Cheng, and Zicheng Liu. 2022. Co-communication graph convolutional network for multi-view crowd counting. IEEE Transactions on Multimedia. Early access, August 17, 2022.","journal-title":"IEEE Transactions on Multimedia."},{"issue":"1","key":"e_1_3_1_81_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3356019","article-title":"Multi-scale supervised attentive encoder-decoder network for crowd counting","volume":"16","author":"Zhang Anran","year":"2020","unstructured":"Anran Zhang, Xiaolong Jiang, Baochang Zhang, and Xianbin Cao. 2020. Multi-scale supervised attentive encoder-decoder network for crowd counting. 
ACM Transactions on Multimedia Computing, Communications, and Applications 16, 1s (2020), 1\u201320.","journal-title":"ACM Transactions on Multimedia Computing, Communications, and Applications"},{"key":"e_1_3_1_82_2","first-page":"589","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops","author":"Zhang Yingying","year":"2016","unstructured":"Yingying Zhang, Desen Zhou, Siqin Chen, Shenghua Gao, and Yi Ma. 2016. Single-image crowd counting via multi-column convolutional neural network. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops. 589\u2013597."},{"key":"e_1_3_1_83_2","first-page":"6881","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops","author":"Zheng Sixiao","year":"2021","unstructured":"Sixiao Zheng, Jiachen Lu, Hengshuang Zhao, Xiatian Zhu, Zekun Luo, Yabiao Wang, Yanwei Fu, et\u00a0al. 2021. Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops. 6881\u20136890."},{"key":"e_1_3_1_84_2","article-title":"Dual path multi-scale fusion networks with attention for crowd counting","author":"Zhu Liang","year":"2019","unstructured":"Liang Zhu, Zhijian Zhao, Chao Lu, Yining Lin, Yao Peng, and Tangren Yao. 2019. Dual path multi-scale fusion networks with attention for crowd counting. 
arXiv preprint arXiv:1902.01115 (2019).","journal-title":"arXiv preprint arXiv:1902.01115"},{"key":"e_1_3_1_85_2","doi-asserted-by":"publisher","DOI":"10.1016\/J.NEUCOM.2019.08.009"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3594670","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3594670","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T18:09:08Z","timestamp":1750183748000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3594670"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,7,12]]},"references-count":84,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2023,11,30]]}},"alternative-id":["10.1145\/3594670"],"URL":"https:\/\/doi.org\/10.1145\/3594670","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,7,12]]},"assertion":[{"value":"2022-12-12","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-04-20","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-07-12","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}