{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,6]],"date-time":"2026-04-06T21:44:11Z","timestamp":1775511851033,"version":"3.50.1"},"reference-count":79,"publisher":"Association for Computing Machinery (ACM)","issue":"5","license":[{"start":{"date-parts":[[2024,1,22]],"date-time":"2024-01-22T00:00:00Z","timestamp":1705881600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100012166","name":"National Key R&D Program of China","doi-asserted-by":"crossref","award":["2021ZD0111902"],"award-info":[{"award-number":["2021ZD0111902"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62072015, U21B2038, U19B2039, and 61902053"],"award-info":[{"award-number":["62072015, U21B2038, U19B2039, and 61902053"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Beijing Natural Science Foundation","award":["4222021"],"award-info":[{"award-number":["4222021"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2024,5,31]]},"abstract":"<jats:p>Most existing weakly supervised crowd counting methods utilize Convolutional Neural Networks (CNN) or Transformer to estimate the total number of individuals in an image. However, both CNN-based (grid-to-count paradigm) and Transformer-based (sequence-to-count paradigm) methods take images as inputs in a regular form. This approach treats all pixels equally but cannot address the uneven distribution problem within human crowds. This challenge would lead to a decline in the counting performance of the model. Compared with grid and sequence, the graph structure could better explore the relationship among features. In this article, we propose a new graph-based crowd counting method named CrowdGraph, which reinterprets the weakly supervised crowd counting problem from a graph-to-count perspective. In the proposed CrowdGraph, each image is constructed as a graph, and a graph-based network is designed to extract features at the graph level. CrowdGraph comprises three main components: a dynamic graph convolutional backbone, a multi-scale dilated graph convolution module, and a regression head. To the best of our knowledge, CrowdGraph is the first method that is completely formulated based on the Graph Neural Network (GNN) for the crowd counting task. Extensive experiments demonstrate that the proposed CrowdGraph outperforms pure CNN-based and pure Transformer-based weakly supervised methods comprehensively and achieves highly competitive counting performance.<\/jats:p>","DOI":"10.1145\/3638774","type":"journal-article","created":{"date-parts":[[2023,12,27]],"date-time":"2023-12-27T22:11:01Z","timestamp":1703715061000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":18,"title":["CrowdGraph: Weakly supervised Crowd Counting via Pure Graph Neural Network"],"prefix":"10.1145","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3658-5779","authenticated-orcid":false,"given":"Chengyang","family":"Zhang","sequence":"first","affiliation":[{"name":"Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing Artificial Intelligence Institute, Faculty of Information Technology, Beijing University of Technology, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6650-6790","authenticated-orcid":false,"given":"Yong","family":"Zhang","sequence":"additional","affiliation":[{"name":"Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing Artificial Intelligence Institute, Faculty of Information Technology, Beijing University of Technology, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0608-1502","authenticated-orcid":false,"given":"Bo","family":"Li","sequence":"additional","affiliation":[{"name":"Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing Artificial Intelligence Institute, Faculty of Information Technology, Beijing University of Technology, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3774-5789","authenticated-orcid":false,"given":"Xinglin","family":"Piao","sequence":"additional","affiliation":[{"name":"Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing Artificial Intelligence Institute, Faculty of Information Technology, Beijing University of Technology, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8125-4648","authenticated-orcid":false,"given":"Baocai","family":"Yin","sequence":"additional","affiliation":[{"name":"Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing Artificial Intelligence Institute, Faculty of Information Technology, Beijing University of Technology, China"}]}],"member":"320","published-online":{"date-parts":[[2024,1,22]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.engappai.2023.107634"},{"key":"e_1_3_1_3_2","doi-asserted-by":"publisher","DOI":"10.1109\/JBHI.2023.3329542"},{"key":"e_1_3_1_4_2","first-page":"4594","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Bai Shuai","year":"2020","unstructured":"Shuai Bai, Zhiqun He, Yu Qiao, Hanzhe Hu, Wei Wu, and Junjie Yan. 2020. Adaptive dilated network with self-correction supervision for counting. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 4594\u20134603."},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2022.3171235"},{"issue":"10","key":"e_1_3_1_6_2","first-page":"3486","article-title":"Pcc net: Perspective crowd counting via spatial convolutional network","volume":"30","author":"Gao Junyu","year":"2019","unstructured":"Junyu Gao, Qi Wang, and Xuelong Li. 2019. Pcc net: Perspective crowd counting via spatial convolutional network. IEEE Trans. Circ. Syst. Vid. Technol. 30, 10 (2019), 3486\u20133498.","journal-title":"IEEE Trans. Circ. Syst. Vid. Technol."},{"key":"e_1_3_1_7_2","article-title":"CCTrans: Simplifying and improving crowd counting with transformer","author":"Tian Ye","year":"2021","unstructured":"Ye Tian, Xiangxiang Chu, and Hongpeng Wang. 2021. CCTrans: Simplifying and improving crowd counting with transformer. arXiv preprint arXiv:2109.14483 (2021).","journal-title":"arXiv preprint arXiv:2109.14483"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00120"},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01901"},{"key":"e_1_3_1_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01042"},{"key":"e_1_3_1_11_2","article-title":"ChatTraffic: Text-to-traffic generation via diffusion model","author":"Zhang Chengyang","year":"2023","unstructured":"Chengyang Zhang, Yong Zhang, Qitan Shao, Bo Li, Yisheng Lv, Xinglin Piao, and Baocai Yin. 2023. ChatTraffic: Text-to-traffic generation via diffusion model. arXiv preprint arXiv:2311.16203 (2023).","journal-title":"arXiv preprint arXiv:2311.16203"},{"key":"e_1_3_1_12_2","first-page":"8780","article-title":"Diffusion models beat GANs on image synthesis","volume":"34","author":"Dhariwal Prafulla","year":"2021","unstructured":"Prafulla Dhariwal and Alexander Nichol. 2021. Diffusion models beat GANs on image synthesis. Adv. Neural Inf. Process. Syst. 34 (2021), 8780\u20138794.","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"e_1_3_1_13_2","article-title":"Diffuse-denoise-count: Accurate crowd-counting with diffusion models","author":"Ranasinghe Yasiru","year":"2023","unstructured":"Yasiru Ranasinghe, Nithin Gopalakrishnan Nair, Wele Gedara Chaminda Bandara, and Vishal M. Patel. 2023. Diffuse-denoise-count: Accurate crowd-counting with diffusion models. arXiv preprint arXiv:2303.12790 (2023).","journal-title":"arXiv preprint arXiv:2303.12790"},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2020.107616"},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58598-3_1"},{"issue":"6","key":"e_1_3_1_16_2","doi-asserted-by":"crossref","first-page":"160104","DOI":"10.1007\/s11432-021-3445-y","article-title":"TransCrowd: Weakly supervised crowd counting with transformers","volume":"65","author":"Liang Dingkang","year":"2022","unstructured":"Dingkang Liang, Xiwu Chen, Wei Xu, Yu Zhou, and Xiang Bai. 2022. TransCrowd: Weakly supervised crowd counting with transformers. Sci. China Inf. Sci. 65, 6 (2022), 160104.","journal-title":"Sci. China Inf. Sci."},{"key":"e_1_3_1_17_2","article-title":"CrowdFormer: Weakly supervised crowd counting with improved generalizability","author":"Savner Siddharth Singh","year":"2022","unstructured":"Siddharth Singh Savner and Vivek Kanhangad. 2022. CrowdFormer: Weakly supervised crowd counting with improved generalizability. arXiv preprint arXiv:2203.03768 (2022).","journal-title":"arXiv preprint arXiv:2203.03768"},{"key":"e_1_3_1_18_2","first-page":"11693","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","volume":"34","author":"Luo Ao","year":"2020","unstructured":"Ao Luo, Fan Yang, Xin Li, Dong Nie, Zhicheng Jiao, Shangchen Zhou, and Hong Cheng. 2020. Hybrid graph neural networks for crowd counting. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34. 11693\u201311700."},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2020.04.117"},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","unstructured":"Qiang Zhai Fan Yang Xin Li Guo-Sen Xie Hong Cheng and Zicheng Liu. 2023. Co-communication graph convolutional network for multi-view crowd counting. In IEEE Transactions on Multimedia Vol. 25 5813\u20135825. DOI:10.1109\/TMM.2022.3199555","DOI":"10.1109\/TMM.2022.3199555"},{"key":"e_1_3_1_21_2","article-title":"Vision GNN: An image is worth graph of nodes","author":"Han Kai","year":"2022","unstructured":"Kai Han, Yunhe Wang, Jianyuan Guo, Yehui Tang, and Enhua Wu. 2022. Vision GNN: An image is worth graph of nodes. arXiv preprint arXiv:2206.00272 (2022).","journal-title":"arXiv preprint arXiv:2206.00272"},{"key":"e_1_3_1_22_2","first-page":"322","volume-title":"Proceedings of Medical Imaging 2023: Biomedical Applications in Molecular, Structural, and Functional Imaging","volume":"12468","author":"Hu Mingzhe","year":"2023","unstructured":"Mingzhe Hu, Jing Wang, Chih-Wei Chang, Tian Liu, and Xiaofeng Yang. 2023. End-to-end brain tumor detection using a graph-feature-based classifier. In Proceedings of Medical Imaging 2023: Biomedical Applications in Molecular, Structural, and Functional Imaging, Vol. 12468. 322\u2013327."},{"key":"e_1_3_1_23_2","first-page":"2722","volume-title":"Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP\u201920)","author":"Kong Xiyu","year":"2020","unstructured":"Xiyu Kong, Muming Zhao, Hao Zhou, and Chongyang Zhang. 2020. Weakly supervised crowd-wise attention for robust crowd counting. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP\u201920). IEEE, 2722\u20132726."},{"key":"e_1_3_1_24_2","article-title":"Glance to count: Learning to rank with anchors for weakly supervised crowd counting","author":"Xiong Zheng","year":"2022","unstructured":"Zheng Xiong, Liangyu Chai, Wenxi Liu, Yongtuo Liu, Sucheng Ren, and Shengfeng He. 2022. Glance to count: Learning to rank with anchors for weakly supervised crowd counting. arXiv preprint arXiv:2205.14659 (2022).","journal-title":"arXiv preprint arXiv:2205.14659"},{"key":"e_1_3_1_25_2","article-title":"Reinforcing local feature representation for weakly supervised dense crowd counting","author":"Chen Xiaoshuang","year":"2022","unstructured":"Xiaoshuang Chen and Hongtao Lu. 2022. Reinforcing local feature representation for weakly supervised dense crowd counting. arXiv preprint arXiv:2202.10681 (2022).","journal-title":"arXiv preprint arXiv:2202.10681"},{"key":"e_1_3_1_26_2","article-title":"Attention is all you need","volume":"30","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. Adv. Neural Inf. Process. Syst. 30 (2017).","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"e_1_3_1_27_2","unstructured":"Alexey Dosovitskiy Lucas Beyer Alexander Kolesnikov Dirk Weissenborn Xiaohua Zhai Thomas Unterthiner Mostafa Dehghani Matthias Minderer Georg Heigold Sylvain Gelly Jakob Uszkoreit Neil Houlsby. 2020. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020)."},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1007\/s41095-022-0313-5"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10044-021-00959-z"},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00371-022-02485-3"},{"key":"e_1_3_1_31_2","article-title":"Joint CNN and transformer network via weakly supervised learning for efficient crowd counting","author":"Wang Fusen","year":"2022","unstructured":"Fusen Wang, Kai Liu, Fei Long, Nong Sang, Xiaofeng Xia, and Jun Sang. 2022. Joint CNN and transformer network via weakly supervised learning for efficient crowd counting. arXiv preprint arXiv:2203.06388 (2022).","journal-title":"arXiv preprint arXiv:2203.06388"},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00371-023-02831-z"},{"key":"e_1_3_1_33_2","first-page":"611","volume-title":"Proceedings of the International Conference on Artificial Life and Robotics","volume":"25","author":"Li Wang","year":"2020","unstructured":"Wang Li, Zhao Huailin, Nie Zhen, and Li Yaoyao. 2020. Graph-based global reasoning network for crowd counting. In Proceedings of the International Conference on Artificial Life and Robotics, Vol. 25. 611\u2013615."},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","unstructured":"Zhe Wu Xinfeng Zhang Geng Tian Yaowei Wang and Qingming Huang. 2023. Spatial-temporal graph network for video crowd counting. IEEE Transactions on Circuits and Systems for Video Technology 33 1 (2023) 228\u2013241. DOI:10.1109\/TCSVT.2022.3187194","DOI":"10.1109\/TCSVT.2022.3187194"},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v35i10.17098"},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.00023"},{"key":"e_1_3_1_37_2","first-page":"2763","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Zhao Gangming","year":"2021","unstructured":"Gangming Zhao, Weifeng Ge, and Yizhou Yu. 2021. GraphFPN: Graph feature pyramid network for object detection. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 2763\u20132772."},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00378"},{"key":"e_1_3_1_39_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-03243-2_194-1"},{"key":"e_1_3_1_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00936"},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1145\/3326362"},{"key":"e_1_3_1_42_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Valsesia Diego","year":"2019","unstructured":"Diego Valsesia, Giulia Fracastoro, and Enrico Magli. 2019. Learning localized generative models for 3D point clouds via graph convolution. In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_1_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.00061"},{"key":"e_1_3_1_44_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/TGRS.2023.3244750","article-title":"PSRT: Pyramid shuffle-and-reshuffle transformer for multispectral and hyperspectral image fusion","volume":"61","author":"Deng Shang-Qi","year":"2023","unstructured":"Shang-Qi Deng, Liang-Jian Deng, Xiao Wu, Ran Ran, Danfeng Hong, and Gemine Vivone. 2023. PSRT: Pyramid shuffle-and-reshuffle transformer for multispectral and hyperspectral image fusion. IEEE Trans. Geosci. Rem. Sens. 61 (2023), 1\u201315.","journal-title":"IEEE Trans. Geosci. Rem. Sens."},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1007\/s41095-022-0274-8"},{"key":"e_1_3_1_46_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2017.2699184"},{"issue":"5","key":"e_1_3_1_47_2","first-page":"2594","article-title":"JHU-CROWD++: Large-scale crowd counting dataset and a benchmark method","volume":"44","author":"Sindagi Vishwanath A.","year":"2020","unstructured":"Vishwanath A. Sindagi, Rajeev Yasarla, and Vishal M. Patel. 2020. JHU-CROWD++: Large-scale crowd counting dataset and a benchmark method. IEEE Trans. Pattern Anal. Mach. Intell. 44, 5 (2020), 2594\u20132609.","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"e_1_3_1_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.70"},{"key":"e_1_3_1_49_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01216-8_33"},{"key":"e_1_3_1_50_2","first-page":"2547","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Idrees Haroon","year":"2013","unstructured":"Haroon Idrees, Imran Saleemi, Cody Seibert, and Mubarak Shah. 2013. Multi-source multi-scale counting in extremely dense crowd images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2547\u20132554."},{"key":"e_1_3_1_51_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2020.3013269"},{"key":"e_1_3_1_52_2","first-page":"1","volume-title":"Proceedings of the IEEE International Conference on Advanced Video and Signal-based Surveillance","author":"Sindagi Vishwanath A.","year":"2017","unstructured":"Vishwanath A. Sindagi and Vishal M. Patel. 2017. CNN-based cascaded multi-task learning of high-level prior and density estimation for crowd counting. In Proceedings of the IEEE International Conference on Advanced Video and Signal-based Surveillance. 1\u20136."},{"key":"e_1_3_1_53_2","first-page":"5099","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Liu Weizhe","year":"2019","unstructured":"Weizhe Liu, Mathieu Salzmann, and Pascal Fua. 2019. Context-aware crowd counting. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 5099\u20135108."},{"key":"e_1_3_1_54_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01228-1_45"},{"key":"e_1_3_1_55_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00109"},{"key":"e_1_3_1_56_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00839"},{"key":"e_1_3_1_57_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00624"},{"key":"e_1_3_1_58_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-021-01542-z"},{"key":"e_1_3_1_59_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2021.3055631"},{"key":"e_1_3_1_60_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00201"},{"key":"e_1_3_1_61_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v35i2.16170"},{"key":"e_1_3_1_62_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01900"},{"key":"e_1_3_1_63_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-19769-7_3"},{"key":"e_1_3_1_64_2","doi-asserted-by":"publisher","DOI":"10.1145\/3594670"},{"key":"e_1_3_1_65_2","doi-asserted-by":"publisher","DOI":"10.1109\/WACV56688.2023.00025"},{"key":"e_1_3_1_66_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00186"},{"issue":"1","key":"e_1_3_1_67_2","doi-asserted-by":"crossref","first-page":"168","DOI":"10.1007\/s44196-021-00016-x","article-title":"A deep-fusion network for crowd counting in high-density crowded scenes","volume":"14","author":"Khan Sultan Daud","year":"2021","unstructured":"Sultan Daud Khan, Yasir Salih, Basim Zafar, and Abdulfattah Noorwali. 2021. A deep-fusion network for crowd counting in high-density crowded scenes. Int. J. Computat. Intell. Syst. 14, 1 (2021), 168.","journal-title":"Int. J. Computat. Intell. Syst."},{"issue":"1","key":"e_1_3_1_68_2","first-page":"1","article-title":"Multi-scale supervised attentive encoder-decoder network for crowd counting","volume":"16","author":"Zhang Anran","year":"2020","unstructured":"Anran Zhang, Xiaolong Jiang, Baochang Zhang, and Xianbin Cao. 2020. Multi-scale supervised attentive encoder-decoder network for crowd counting. ACM Trans. Multim. Comput., Commun. Applic. 16, 1s (2020), 1\u201320.","journal-title":"ACM Trans. Multim. Comput., Commun. Applic."},{"issue":"4","key":"e_1_3_1_69_2","doi-asserted-by":"crossref","first-page":"3051","DOI":"10.1007\/s13369-020-04990-w","article-title":"Sparse to dense scale prediction for crowd counting in high density crowds","volume":"46","author":"Khan Sultan Daud","year":"2021","unstructured":"Sultan Daud Khan and Saleh Basalamah. 2021. Sparse to dense scale prediction for crowd counting in high density crowds. Arab. J. Sci. Eng. 46, 4 (2021), 3051\u20133065.","journal-title":"Arab. J. Sci. Eng."},{"issue":"8","key":"e_1_3_1_70_2","doi-asserted-by":"crossref","first-page":"2127","DOI":"10.1007\/s00371-020-01974-7","article-title":"Scale and density invariant head detection deep model for crowd counting in pedestrian crowds","volume":"37","author":"Khan Sultan Daud","year":"2021","unstructured":"Sultan Daud Khan and Saleh Basalamah. 2021. Scale and density invariant head detection deep model for crowd counting in pedestrian crowds. Visual Comput. 37, 8 (2021), 2127\u20132137.","journal-title":"Visual Comput."},{"key":"e_1_3_1_71_2","first-page":"2576","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","volume":"35","author":"Song Qingyu","year":"2021","unstructured":"Qingyu Song, Changan Wang, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Jian Wu, and Jiayi Ma. 2021. To choose or to fuse? Scale selection for crowd counting. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 35. 2576\u20132583."},{"key":"e_1_3_1_72_2","doi-asserted-by":"publisher","unstructured":"Dingkang Liang Wei Xu Yingying Zhu and Yu Zhou. 2023. Focal inverse distance transform maps for crowd localization. In IEEE Transactions on Multimedia 25 (2023) 6040\u20136052. DOI:10.1109\/TMM.2022.3203870","DOI":"10.1109\/TMM.2022.3203870"},{"key":"e_1_3_1_73_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.00335"},{"key":"e_1_3_1_74_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2020.2978717"},{"key":"e_1_3_1_75_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jvcir.2023.103853"},{"key":"e_1_3_1_76_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2019.2918650"},{"key":"e_1_3_1_77_2","first-page":"3225","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Liu Ning","year":"2019","unstructured":"Ning Liu, Yongchao Long, Changqing Zou, Qun Niu, Li Pan, and Hefeng Wu. 2019. ADCrowdNet: An attention-injective deformable convolutional network for crowd understanding. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 3225\u20133234."},{"key":"e_1_3_1_78_2","first-page":"4036","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Wan Jia","year":"2019","unstructured":"Jia Wan, Wenhan Luo, Baoyuan Wu, Antoni B. Chan, and Wei Liu. 2019. Residual regression with semantic prior for crowd counting. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 4036\u20134045."},{"key":"e_1_3_1_79_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2022.3146459"},{"key":"e_1_3_1_80_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2022.09.113"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3638774","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3638774","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T00:03:34Z","timestamp":1750291414000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3638774"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,1,22]]},"references-count":79,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2024,5,31]]}},"alternative-id":["10.1145\/3638774"],"URL":"https:\/\/doi.org\/10.1145\/3638774","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,1,22]]},"assertion":[{"value":"2023-07-18","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-12-23","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-01-22","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}