{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,27]],"date-time":"2025-10-27T21:07:10Z","timestamp":1761599230900,"version":"3.41.0"},"reference-count":59,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2022,3,4]],"date-time":"2022-03-04T00:00:00Z","timestamp":1646352000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100004663","name":"Ministry of Science and Technology (MOST) of Taiwan","doi-asserted-by":"crossref","award":["MOST-109-2223-E-009-002-MY3, MOST-110-2634-F-007-015, MOST-110-2634-F-009-021, MOST-109-2218-E-002-015, MOST-109-2221-E-009-114-MY3, MOST-110-2218-E-A49-018, MOST-109-2221-E-009-097 and MOST-109-2221-E-001-015"],"award-info":[{"award-number":["MOST-109-2223-E-009-002-MY3, MOST-110-2634-F-007-015, MOST-110-2634-F-009-021, MOST-109-2218-E-002-015, MOST-109-2221-E-009-114-MY3, MOST-110-2218-E-A49-018, MOST-109-2221-E-009-097 and MOST-109-2221-E-001-015"]}],"id":[{"id":"10.13039\/501100004663","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2022,8,31]]},"abstract":"<jats:p>A recent line of research focuses on crowd density estimation from RGB images for a variety of applications, for example, surveillance and traffic flow control. The performance drops dramatically for low-quality images, such as occlusion, or poor light conditions. However, people are equipped with various wireless devices, allowing the received signals to be easily collected at the base station. As such, another line of research utilizes received signals for crowd counting. Nevertheless, received signals offer only information regarding the number of people, while an accurate density map cannot be derived. As unmanned aerial vehicles (UAVs) are now treated as flying base stations and equipped with cameras, we make the first attempt to leverage both RGB images and received signals for crowd density estimation on UAVs. Specifically, we propose a novel network to effectively fuse the RGB images and received signal strength (RSS) information. Moreover, we design a new loss function that considers the uncertainty from RSS and makes the prediction consistent with the received signals. Experimental results show that the proposed method successfully helps break the limit of traditional crowd density estimation methods and achieves state-of-the-art performance. The proposed dataset is released as a public download for future research.<\/jats:p>","DOI":"10.1145\/3492346","type":"journal-article","created":{"date-parts":[[2022,3,4]],"date-time":"2022-03-04T10:26:32Z","timestamp":1646389592000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":5,"title":["Improving Crowd Density Estimation by Fusing Aerial Images and Radio Signals"],"prefix":"10.1145","volume":"18","author":[{"given":"Kai-Wei","family":"Yang","sequence":"first","affiliation":[{"name":"National Yang Ming Chiao Tung University, Hsinchu, Taiwan"}]},{"given":"Yen-Yun","family":"Huang","sequence":"additional","affiliation":[{"name":"National Yang Ming Chiao Tung University, Hsinchu, Taiwan"}]},{"given":"Jen-Wei","family":"Huang","sequence":"additional","affiliation":[{"name":"National Yang Ming Chiao Tung University, Hsinchu, Taiwan"}]},{"given":"Ya-Rou","family":"Hsu","sequence":"additional","affiliation":[{"name":"National Yang Ming Chiao Tung University, Hsinchu, Taiwan"}]},{"given":"Chang-Lin","family":"Wan","sequence":"additional","affiliation":[{"name":"National Yang Ming Chiao Tung University, Hsinchu, Taiwan"}]},{"given":"Hong-Han","family":"Shuai","sequence":"additional","affiliation":[{"name":"National Yang Ming Chiao Tung University, Hsinchu, Taiwan"}]},{"given":"Li-Chun","family":"Wang","sequence":"additional","affiliation":[{"name":"National Yang Ming Chiao Tung University, Hsinchu, Taiwan"}]},{"given":"Wen-Huang","family":"Cheng","sequence":"additional","affiliation":[{"name":"National Yang Ming Chiao Tung University and National Chung Hsing University, Hsinchu, Taiwan"}]}],"member":"320","published-online":{"date-parts":[[2022,3,4]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"publisher","DOI":"10.1109\/JSAC.2014.2328098"},{"key":"e_1_3_2_3_2","unstructured":"Anas Basalamah. 2016. Automatic Update of Crowd and Traffic Data Using Device Monitoring. (Jul 2016). US Patent 9 401 086."},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1147\/sj.41.0025"},{"key":"e_1_3_2_5_2","volume-title":"Proceedings of the European Conference on Computer Vision (ECCV\u201918), Munich, Germany","author":"Cao Xinkun","year":"2020","unstructured":"Xinkun Cao, Zhipeng Wang, Yanyun Zhao, and Fei Su. 2020. Scale aggregation network for accurate and efficient crowd counting. In Proceedings of the European Conference on Computer Vision (ECCV\u201918), Munich, Germany. Springer, 757\u2013773."},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2011.2172800"},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2019.11.064"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00625"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1145\/3343031.3350898"},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2005.177"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1145\/2935651.2935657"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2011.155"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1145\/2647868.2655040"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2019.08.018"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1145\/3343031.3350881"},{"key":"e_1_3_2_16_2","first-page":"315","volume-title":"Proceedings of the International Conference on Extending Database Technology\/International Conference on Database Theory (EDBT\/ICDT\u201914), Joint Conference, Athens, Greece","author":"Handte Marcus","year":"2014","unstructured":"Marcus Handte, Muhammad Umer Iqbal, Stephan Wagner, Wolfgang Apolinarski, Pedro Marr\u00f3n, Eva Maria Mu\u00f1oz Navarro, Santiago Martinez, Sara Izquierdo Barthelemy, and Mario G. Fern\u00e1ndez. 2014. Crowd density estimation for public transport vehicles. In Proceedings of the International Conference on Extending Database Technology\/International Conference on Database Theory (EDBT\/ICDT\u201914), Joint Conference, Athens, Greece. CEUR-WS.org, 315\u2013322."},{"key":"e_1_3_2_17_2","first-page":"346","article-title":"Spatial pyramid pooling in deep convolutional networks for visual recognition","author":"He Kaiming","year":"2014","unstructured":"Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2014. Spatial pyramid pooling in deep convolutional networks for visual recognition. In Proceedings of the European Conference on Computer Vision (ECCV\u201914), Zurich, Switzerland. Springer, 346\u2013361.","journal-title":"Proceedings of the European Conference on Computer Vision (ECCV\u201914), Zurich, Switzerland"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jvcir.2016.03.021"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2013.329"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2015.2396051"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00476"},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.1109\/COMST.2019.2915069"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1109\/WACV.2016.7477685"},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1109\/GLOBECOM38437.2019.9014210"},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.1145\/3394171.3413602"},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2014.2358029"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00120"},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00192"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2019.2954747"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1109\/JURSE.2019.8809046"},{"key":"e_1_3_2_31_2","unstructured":"Weizhe Liu Krzysztof Maciej Lis Mathieu Salzmann and Pascal Fua. 2019. Geometric and physical constraints for drone-based head plane crowd density estimation. In Proceedings of IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS\u201919) Macau SAR China . IEEE 244\u2013249."},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00524"},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58586-0_15"},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58555-6_15"},{"key":"e_1_3_2_35_2","doi-asserted-by":"crossref","unstructured":"Jonathan Long Evan Shelhamer and Trevor Darrell. 2015. Fully Convolutional Networks for Semantic Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201915) Boston MA USA . IEEE Computer Society 3431\u20133440.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2021.3050059"},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2013.328"},{"key":"e_1_3_2_38_2","first-page":"79","article-title":"Measuring the accuracy of crowd counting using WiFi probe-request-frame counting technique","volume":"8","author":"Ooi Yaik","year":"2016","unstructured":"Yaik Ooi, Kong Zan Wai, Ian Tan, and Ooi Boon Sheng. 2016. Measuring the accuracy of crowd counting using WiFi probe-request-frame counting technique. Journal of Telecommunication, Electronic and Computer Engineering 8, 2 (2016), 79\u201381.","journal-title":"Journal of Telecommunication, Electronic and Computer Engineering"},{"key":"e_1_3_2_39_2","unstructured":"Xingang Pan Jianping Shi Ping Luo Xiaogang Wang and Xiaoou Tang. 2017. Spatial As Deep: Spatial CNN for Traffic Scene Understanding. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI\u201918) New Orleans Louisiana USA . AAAI Press 7276\u20137283."},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/DICTA.2009.22"},{"key":"e_1_3_2_41_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.429"},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00745"},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICAIIC.2019.8669071"},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58621-8_13"},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.1145\/3394171.3413611"},{"key":"e_1_3_2_46_2","article-title":"End-to-end people detection in crowded scenes","author":"Stewart Russell","year":"2016","unstructured":"Russell Stewart, Mykhaylo Andriluka, and Andrew Yan-Tak Ng. 2016. End-to-end people detection in crowded scenes. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201916), Las Vegas, NV, USA. IEEE Computer Society, 2325\u20132333.","journal-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201916), Las Vegas, NV, USA"},{"key":"e_1_3_2_47_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-55615-4"},{"key":"e_1_3_2_48_2","doi-asserted-by":"publisher","DOI":"10.1145\/3343031.3350914"},{"key":"e_1_3_2_49_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2019.2952083"},{"key":"e_1_3_2_50_2","first-page":"10009","volume-title":"IEEE Internet of Things Journal","author":"Wang Haijun","year":"2019","unstructured":"Haijun Wang, Haitao Zhao, Weiyu Wu, Jun Xiong, Dongtang Ma, and Jibo Wei. 2019. Deployment algorithms of flying base stations: 5G and beyond with UAVs. In IEEE Internet of Things Journal 6, 6 (2019), 10009\u201310027."},{"key":"e_1_3_2_51_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00839"},{"key":"e_1_3_2_52_2","doi-asserted-by":"publisher","DOI":"10.1145\/3338533.3366687"},{"key":"e_1_3_2_53_2","article-title":"Human-like traffic scene understanding system: A survey","author":"Xia Zi-Xiang","year":"2020","unstructured":"Zi-Xiang Xia, Wei-Cheng Lai, Li-Wu Tsao, Lien-Feng Hsu, Chih-Chia Hu Yu, Hong-Han Shuai, and Wen-Huang Cheng. 2020. Human-like traffic scene understanding system: A survey. IEEE Industrial Electronics Magazine 15, 1 (2020), 6\u201315.","journal-title":"IEEE Industrial Electronics Magazine"},{"key":"e_1_3_2_54_2","doi-asserted-by":"publisher","DOI":"10.1109\/MWC.2018.1700393"},{"issue":"1","key":"e_1_3_2_55_2","article-title":"Multi-scale supervised attentive encoder-decoder network for crowd counting","volume":"16","author":"Zhang Anran","year":"2020","unstructured":"Anran Zhang, Xiaolong Jiang, and Xianbin Cao Baochang Zhang. 2020. Multi-scale supervised attentive encoder-decoder network for crowd counting. ACM Transactions on Multimedia Computing, Communications, and Applications Article 28, 16, 1 (April 2020).","journal-title":"ACM Transactions on Multimedia Computing, Communications, and Applications Article 28,"},{"key":"e_1_3_2_56_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-33039-2_4"},{"key":"e_1_3_2_57_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.70"},{"key":"e_1_3_2_58_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2015.03.083"},{"key":"e_1_3_2_59_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11276-020-02274-7"},{"key":"e_1_3_2_60_2","unstructured":"Pengfei Zhu Longyin Wen Dawei Du Xiao Bian Qinghua Hu and Haibin Ling. 2020. Vision Meets Drones: Past Present and Future. (2020). arxiv:2001.06303"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3492346","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3492346","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:31:09Z","timestamp":1750188669000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3492346"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,3,4]]},"references-count":59,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2022,8,31]]}},"alternative-id":["10.1145\/3492346"],"URL":"https:\/\/doi.org\/10.1145\/3492346","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"type":"print","value":"1551-6857"},{"type":"electronic","value":"1551-6865"}],"subject":[],"published":{"date-parts":[[2022,3,4]]},"assertion":[{"value":"2021-02-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-10-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-03-04","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}