{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T03:38:14Z","timestamp":1760240294288,"version":"build-2065373602"},"reference-count":50,"publisher":"MDPI AG","issue":"9","license":[{"start":{"date-parts":[[2019,4,29]],"date-time":"2019-04-29T00:00:00Z","timestamp":1556496000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"the Artificial Intelligence Development and Innovation Project of Shanghai","award":["2018-RGZN-01013"],"award-info":[{"award-number":["2018-RGZN-01013"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Crowd counting, which is widely used in disaster management, traffic monitoring, and other fields of urban security, is a challenging task that is attracting increasing interest from researchers. For better accuracy, most methods have attempted to handle the scale variation explicitly, which results in huge scale changes of the object size. However, earlier methods based on convolutional neural networks (CNN) have focused primarily on improving accuracy while ignoring the complexity of the model. This paper proposes a novel method based on a lightweight CNN-based network for estimating crowd counting and generating density maps under resource constraints. The network is composed of three components: a basic feature extractor (BFE), a stacked \u00e0 trous convolution module (SACM), and a context fusion module (CFM). The BFE encodes basic feature information with reduced spatial resolution for further refining. Various pieces of contextual information are generated through a short pipeline in SACM. To generate a context fusion density map, CFM distills feature maps from the above components. The whole network is trained in an end-to-end fashion and uses a compression factor to restrict its size. 
Experiments on three highly-challenging datasets demonstrate that the proposed method delivers attractive performance.<\/jats:p>","DOI":"10.3390\/s19092013","type":"journal-article","created":{"date-parts":[[2019,4,29]],"date-time":"2019-04-29T07:01:22Z","timestamp":1556521282000},"page":"2013","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["Design and Analysis of a Lightweight Context Fusion CNN Scheme for Crowd Counting"],"prefix":"10.3390","volume":"19","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0394-4635","authenticated-orcid":false,"given":"Yang","family":"Yu","sequence":"first","affiliation":[{"name":"College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, Shanghai 201418, China"}]},{"given":"Jifeng","family":"Huang","sequence":"additional","affiliation":[{"name":"College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, Shanghai 201418, China"}]},{"given":"Wen","family":"Du","sequence":"additional","affiliation":[{"name":"DS Information Technology Co., Ltd., Shanghai 200032, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0394-4635","authenticated-orcid":false,"given":"Naixue","family":"Xiong","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Computer Science, Northeastern State University, Tahlequah, OK 74464, USA"}]}],"member":"1968","published-online":{"date-parts":[[2019,4,29]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1049","DOI":"10.1109\/TIP.2017.2740160","article-title":"Body Structure Aware Deep Crowd Counting","volume":"27","author":"Huang","year":"2018","journal-title":"IEEE Trans. 
Image Process."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"530","DOI":"10.1016\/j.jvcir.2016.03.021","article-title":"Dense Crowd Counting from Still Images with Convolutional Neural Networks","volume":"38","author":"Hu","year":"2016","journal-title":"J. Vis. Commun. Image Represent."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Idrees, H., Saleemi, I., Seibert, C., and Shah, M. (2013, January 23\u201328). Multi-source Multi-scale Counting in Extremely Dense Crowd Images. Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA.","DOI":"10.1109\/CVPR.2013.329"},{"key":"ref_4","unstructured":"Sindagi, V.A., and Patel, V.M. (September, January 29). CNN-Based cascaded multi-task learning of high-level prior and density estimation for crowd counting. Proceedings of the IEEE International Conference on Advanced Video and Signal Based Surveillance, Lecce, Italy."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1016\/j.imavis.2017.01.010","article-title":"Going deeper into action recognition: A survey","volume":"60","author":"Herath","year":"2017","journal-title":"Image Vis. Comput."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1442","DOI":"10.1109\/TPAMI.2013.230","article-title":"Visual Tracking: An Experimental Survey","volume":"36","author":"Smeulders","year":"2014","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"367","DOI":"10.1109\/TCSVT.2014.2358029","article-title":"Crowded Scene Analysis: A Survey","volume":"25","author":"Li","year":"2015","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Zhou, D., Chen, S., Gao, S., and Ma, Y. (2016, January 27\u201330). Single-Image Crowd Counting via Multi-Column Convolutional Neural Network. 
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.70"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Sindagi, V.A., and Patel, V.M. (2017, January 22\u201329). Generating High-Quality Crowd Density Maps Using Contextual Pyramid CNNs. Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.206"},{"key":"ref_10","unstructured":"Kang, D., and Chan, A. (2018, January 3\u20136). Crowd Counting by Adaptively Fusing Predictions from an Image Pyramid. Proceedings of the British Machine Vision Conference (BMVC), Newcastle upon Tyne, UK."},{"key":"ref_11","unstructured":"O\u00f1oro, D., and J L\u00f3pez-Sastre, R. (2016). Towards Perspective-Free Object Counting with Deep Learning. European Conference on Computer Vision, Springer."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Sam, D.B., Surya, S., and Babu, R.V. (2017, January 21\u201326). Switching Convolutional Neural Network for Crowd Counting. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.429"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Li, Y., Zhang, X., and Chen, D. (2018, January 18\u201323). CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00120"},{"key":"ref_14","unstructured":"Cong, Z., Hongsheng, L., Wang, X., and Xiaokang, Y. (2015, January 7\u201312). Cross-scene crowd counting via deep convolutional neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA."},{"key":"ref_15","unstructured":"Chen, C.L., Chen, K., Gong, S., and Xiang, T. (2013). 
Crowd Counting and Profiling: Methodology and Evaluation. Modeling, Simulation and Visual Analysis of Crowds: A Multidisciplinary Perspective, Springer."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"743","DOI":"10.1109\/TPAMI.2011.155","article-title":"Pedestrian Detection: An Evaluation of the State of the Art","volume":"34","author":"Dollar","year":"2012","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Li, M., Zhang, Z., Huang, K., and Tan, T. (2008, January 8\u201311). Estimating the number of people in crowded scenes by MID based foreground segmentation and head-shoulder detection. Proceedings of the 2008 19th International Conference on Pattern Recognition, Tampa, FL, USA.","DOI":"10.1109\/ICPR.2008.4761705"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Liu, S., Zhai, S., Li, C., and Tang, J. (2017, January 14\u201317). An effective approach to crowd counting with CNN-based statistical features. Proceedings of the 2017 International Smart Cities Conference (ISC2), Wuxi, China.","DOI":"10.1109\/ISC2.2017.8090827"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"2188","DOI":"10.1109\/TPAMI.2011.70","article-title":"Hough Forests for Object Detection, Tracking, and Action Recognition","volume":"33","author":"Gall","year":"2011","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1023\/B:VISI.0000013087.49260.fb","article-title":"Robust Real-Time Face Detection","volume":"57","author":"Viola","year":"2004","journal-title":"Int. J. Comput. Vis."},{"key":"ref_21","unstructured":"Dalal, N., and Triggs, B. (2005, January 20\u201325). Histograms of oriented gradients for human detection. 
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, CA, USA."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"1137","DOI":"10.1109\/TPAMI.2016.2577031","article-title":"Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks","volume":"39","author":"Ren","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_23","unstructured":"Redmon, J., and Farhadi, A. (2018). YOLOv3: An Incremental Improvement. arXiv."},{"key":"ref_24","unstructured":"Chan, A.B., and Vasconcelos, N. (October, January 29). Bayesian Poisson regression for crowd counting. Proceedings of the IEEE 12th International Conference on Computer Vision, Kyoto, Japan."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Ryan, D., Denman, S., Fookes, C., and Sridharan, S. (2009, January 1\u20133). Crowd Counting Using Multiple Local Features. Proceedings of the 2009 Digital Image Computing: Techniques and Applications, Melbourne, VIC, Australia.","DOI":"10.1109\/DICTA.2009.22"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Chen, K., Loy, C.C., Gong, S., and Xiang, T. (2012, January 3\u20137). Feature Mining for Localised Crowd Counting. Proceedings of the British Machine Vision Conference, Surrey, UK.","DOI":"10.5244\/C.26.21"},{"key":"ref_27","unstructured":"Lempitsky, V.S., and Zisserman, A. (2010, January 6\u20139). Learning To Count Objects in Images. Proceedings of the 23rd International Conference on Neural Information Processing Systems, Vancouver, BC, Canada."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Pham, V., Kozakaya, T., Yamaguchi, O., and Okada, R. (2015, January 7\u201313). COUNT Forest: CO-Voting Uncertain Number of Targets Using Random Forest for Crowd Density Estimation. 
Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile.","DOI":"10.1109\/ICCV.2015.372"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Wang, Y., and Zou, Y. (2016, January 25\u201328). Fast visual object counting via example-based density estimation. Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA.","DOI":"10.1109\/ICIP.2016.7533041"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Xu, B., and Qiu, G. (2016, January 7\u201310). Crowd density estimation based on rich features and random projection forest. Proceedings of the 2016 IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Placid, NY, USA.","DOI":"10.1109\/WACV.2016.7477682"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Wang, C., Zhang, H., Yang, L., Liu, S., and Cao, X. (2015, January 26\u201330). Deep People Counting in Extremely Dense Crowds. Proceedings of the 23rd ACM International Conference on Multimedia, Brisbane, Australia.","DOI":"10.1145\/2733373.2806337"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Walach, E., and Wolf, L. (2016). Learning to Count with CNN Boosting. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-46475-6_41"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Shang, C., Ai, H., and Bai, B. (2016, January 25\u201328). End-to-end crowd counting via joint learning local and global count. Proceedings of the IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA.","DOI":"10.1109\/ICIP.2016.7532551"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Boominathan, L., Kruthiventi, S.S.S., and Babu, R.V. (2016, January 15\u201319). CrowdNet: A Deep Convolutional Network for Dense Crowd Counting. 
Proceedings of the 24th ACM International Conference on Multimedia, Amsterdam, The Netherlands.","DOI":"10.1145\/2964284.2967300"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"1119","DOI":"10.1007\/s00138-018-0955-6","article-title":"Mixture of counting CNNs","volume":"29","author":"Kumagai","year":"2018","journal-title":"Mach. Vis. Appl."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Sam, D.B., Sajjan, N.N., Babu, R.V., and Srinivasan, M. (2018, January 18\u201323). Divide and Grow: Capturing Huge Diversity in Crowd Images with Incrementally Growing CNN. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00381"},{"key":"ref_37","unstructured":"Marsden, M., McGuiness, K., Little, S., and O\u2019Connor, N. (March, January 27). Fully Convolutional Crowd Counting On Highly Congested Scenes. Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISAPP), Porto, Portugal."},{"key":"ref_38","unstructured":"Leibe, B., Matas, J., Sebe, N., and Welling, M. (2016). Crossing-Line Crowd Counting with Two-Phase Deep Neural Networks. European Conference on Computer Vision, Springer."},{"key":"ref_39","unstructured":"Marsden, M., McGuinness, K., Little, S., and Connor, N.E.O. (September, January 29). ResnetCrowd: A residual deep learning architecture for crowd counting, violent behaviour detection and crowd density level classification. Proceedings of the IEEE International Conference on Advanced Video and Signal Based Surveillance, Lecce, Italy."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Li, J., Yang, H., Chen, L., Li, J., and Zhi, C. (2017, January 7\u20139). An end-to-end generative adversarial network for crowd counting under complicated scenes. 
Proceedings of the IEEE International Symposium on Broadband Multimedia Systems and Broadcasting, Cagliari, Italy.","DOI":"10.1109\/BMSB.2017.7986133"},{"key":"ref_41","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Shen, Z., Xu, Y., Ni, B., Wang, M., Hu, J., and Yang, X. (2018, January 18\u201323). Crowd Counting via Adversarial Cross-Scale Consistency Pursuit. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00550"},{"key":"ref_43","unstructured":"Iandola, F.N., Han, S., Moskewicz, M.W., Ashraf, K., Dally, W.J., and Keutzer, K. (2016). SqueezeNet: AlexNet-level accuracy with 50\u00d7 fewer parameters and <0.5 MB model size. arXiv."},{"key":"ref_44","unstructured":"Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., and Adam, H. (2017). MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Zhang, X., Zhou, X., Lin, M., and Sun, J. (2018, January 18\u201323). ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00716"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Gholami, A., Kwon, K., Wu, B., Tai, Z., Yue, X., Jin, P., Zhao, S., and Keutzer, K. (2018, January 18\u201322). SqueezeNext: Hardware-Aware Neural Network Design. 
Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPRW.2018.00215"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Wang, P., Chen, P., Yuan, Y., Liu, D., Huang, Z., Hou, X., and Cottrell, G. (2018, January 12\u201315). Understanding Convolution for Semantic Segmentation. Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Tahoe, NV, USA.","DOI":"10.1109\/WACV.2018.00163"},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"1904","DOI":"10.1109\/TPAMI.2015.2389824","article-title":"Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition","volume":"37","author":"He","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015). U-Net: Convolutional Networks for Biomedical Image Segmentation. Medical Image Computing and Computer-Assisted Intervention (MICCAI), Springer.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_50","unstructured":"Mart\u00edn, A., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G.S., Davis, A., Dean, J., and Devin, M. (2016). TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems. 
arXiv."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/19\/9\/2013\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T12:47:59Z","timestamp":1760186879000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/19\/9\/2013"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,4,29]]},"references-count":50,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2019,5]]}},"alternative-id":["s19092013"],"URL":"https:\/\/doi.org\/10.3390\/s19092013","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2019,4,29]]}}}