{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,28]],"date-time":"2025-10-28T06:36:57Z","timestamp":1761633417788,"version":"build-2065373602"},"reference-count":44,"publisher":"Institution of Engineering and Technology (IET)","issue":"10","license":[{"start":{"date-parts":[[2021,4,1]],"date-time":"2021-04-01T00:00:00Z","timestamp":1617235200000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100003453","name":"Natural Science Foundation of Guangdong Province","doi-asserted-by":"publisher","award":["No. 2016A030313288"],"award-info":[{"award-number":["No. 2016A030313288"]}],"id":[{"id":"10.13039\/501100003453","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["ietresearch.onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["IET Image Processing"],"published-print":{"date-parts":[[2021,8]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Large\u2010scale variations may cause a serious problem in crowd counting. In recent years, most methods for this problem use convolutional neural networks with a fixed scale for encoding and decoding image features. The scale of the convolutional layer is usually manually adjusted and may have to deal with image features on unfitted scales. In this paper, a method called scale\u2010aware convolutional neural network(SCNet) is proposed, which adds a scale selection mechanism to the dilated convolutional operation. Shared weight multi\u2010branch is used to deal with features on different scales, and an attention mechanism is introduced to determine the weights of the branches that fit the scale. Experimental results demonstrate that the proposed SCNet outperforms most existing\u00a0methods.<\/jats:p>","DOI":"10.1049\/ipr2.12187","type":"journal-article","created":{"date-parts":[[2021,4,1]],"date-time":"2021-04-01T07:19:42Z","timestamp":1617261582000},"page":"2192-2201","update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Improving crowd counting with scale\u2010aware convolutional neural network"],"prefix":"10.1049","volume":"15","author":[{"given":"Qingge","family":"Ji","sequence":"first","affiliation":[{"name":"School of Data and computer Science Sun Yat\u2010sen University  Guangzhou China"},{"name":"Guangdong Province Key Laboratory of Big Data Analysis and Processing  Guangzhou China"}]},{"given":"Hang","family":"Chen","sequence":"additional","affiliation":[{"name":"School of Data and computer Science Sun Yat\u2010sen University  Guangzhou China"},{"name":"Guangdong Province Key Laboratory of Big Data Analysis and Processing  Guangzhou China"}]},{"given":"Di","family":"Bao","sequence":"additional","affiliation":[{"name":"School of Data and computer Science Sun Yat\u2010sen University  Guangzhou China"},{"name":"Guangdong Province Key Laboratory of Big Data Analysis and Processing  Guangzhou China"}]}],"member":"265","published-online":{"date-parts":[[2021,4]]},"reference":[{"key":"e_1_2_7_2_1","doi-asserted-by":"crossref","unstructured":"Zhang Y. et\u00a0al.:Single\u2010image crowd counting via multi\u2010column convolutional neural network. In:Proceedings of the IEEE Conference on computer Vision and Pattern Recognition pp.589\u2013597.IEEE Piscataway NJ(2016)","DOI":"10.1109\/CVPR.2016.70"},{"key":"e_1_2_7_3_1","doi-asserted-by":"crossref","unstructured":"Cao X. et\u00a0al.:Scale aggregation network for accurate and efficient crowd counting. In:Proceedings of the European Conference on Computer Vision pp.734\u2013750.Springer Berlin (2018)","DOI":"10.1007\/978-3-030-01228-1_45"},{"key":"e_1_2_7_4_1","doi-asserted-by":"crossref","unstructured":"Li Y. Zhang X. Chen D.:Csrnet: Dilated convolutional neural networks for understanding the highly congested scenes. In:Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition pp.1091\u20131100.IEEE Piscataway NJ(2018)","DOI":"10.1109\/CVPR.2018.00120"},{"key":"e_1_2_7_5_1","doi-asserted-by":"crossref","unstructured":"Idrees H. et\u00a0al.:Multi\u2010source multi\u2010scale counting in extremely dense crowd images. In:Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition pp.2547\u20132554.IEEE Piscataway NJ(2013)","DOI":"10.1109\/CVPR.2013.329"},{"key":"e_1_2_7_6_1","doi-asserted-by":"crossref","unstructured":"Li M. et\u00a0al.:Estimating the number of people in crowded scenes by mid based foreground segmentation and head\u2010shoulder detection. In:2008 19th International Conference on Pattern Recognition pp.1\u20134.IEEE Piscataway NJ(2009)","DOI":"10.1109\/ICPR.2008.4761705"},{"key":"e_1_2_7_7_1","doi-asserted-by":"crossref","unstructured":"Idrees H. et\u00a0al.:Composition loss for counting density map estimation and localization in dense crowds. In:Proceedings of the European Conference on Computer Vision (ECCV) pp.532\u2013546.Springer Berlin (2018)","DOI":"10.1007\/978-3-030-01216-8_33"},{"key":"e_1_2_7_8_1","doi-asserted-by":"crossref","unstructured":"Onoro\u2010Rubio D. L\u00f3pez\u2010Sastre R.J.:Towards perspective\u2010free object counting with deep learning. In:European Conference on Computer Vision pp.615\u2013629.Springer Berlin(2016)","DOI":"10.1007\/978-3-319-46478-7_38"},{"key":"e_1_2_7_9_1","doi-asserted-by":"crossref","unstructured":"Boominathan L. Kruthiventi S.S. Babu R.V.:Crowdnet: A deep convolutional network for dense crowd counting. In:Proceedings of the 24th ACM international conference on Multimedia pp.640\u2013644.ACM Press New York NY(2016)","DOI":"10.1145\/2964284.2967300"},{"key":"e_1_2_7_10_1","doi-asserted-by":"crossref","unstructured":"Sam D.B. Surya S. Babu R.V.:Switching convolutional neural network for crowd counting. In:2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) pp.4031\u20134039.IEEE Piscataway NJ(2017)","DOI":"10.1109\/CVPR.2017.429"},{"key":"e_1_2_7_11_1","doi-asserted-by":"crossref","unstructured":"Sindagi V.A. Patel V.M.:Generating high\u2010quality crowd density maps using contextual pyramid cnns. In:Proceedings of the IEEE International Conference on Computer Vision pp.1861\u20131870.IEEE Piscataway NJ(2017)","DOI":"10.1109\/ICCV.2017.206"},{"key":"e_1_2_7_12_1","doi-asserted-by":"crossref","unstructured":"Zeng L. et\u00a0al.:Multi\u2010scale convolutional neural networks for crowd counting. In:2017 IEEE International Conference on Image Processing (ICIP) pp.465\u2013469.IEEE Piscataway NJ(2017)","DOI":"10.1109\/ICIP.2017.8296324"},{"key":"e_1_2_7_13_1","doi-asserted-by":"crossref","unstructured":"Szegedy C. et\u00a0al.:Rethinking the inception architecture for computer vision. In:Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition pp.2818\u20132826.IEEE Piscataway NJ(2016)","DOI":"10.1109\/CVPR.2016.308"},{"key":"e_1_2_7_14_1","unstructured":"Krizhevsky A. Sutskever I. Hinton G.E.:Imagenet classification with deep convolutional neural networks. In:Advances in Neural Information Processing Systems pp.1097\u20131105.MIT Press Cambridge MA(2012)"},{"key":"e_1_2_7_15_1","unstructured":"Simonyan K. Zisserman A.:Very deep convolutional networks for large\u2010scale image recognition.arXiv:14091556(2014)"},{"key":"e_1_2_7_16_1","doi-asserted-by":"crossref","unstructured":"Szegedy C. et\u00a0al.:Going deeper with convolutions. In:Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition pp.1\u20139.IEEE Piscataway NJ(2015)","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"e_1_2_7_17_1","doi-asserted-by":"crossref","unstructured":"He K. et\u00a0al.:Deep residual learning for image recognition. In:Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition pp.770\u2013778.IEEE Piscataway NJ(2016)","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_2_7_18_1","doi-asserted-by":"crossref","unstructured":"Fu J. Zheng H. Mei T.:Look closer to see better: Recurrent attention convolutional neural network for fine\u2010grained image recognition. In:Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition pp.4438\u20134446.IEEE Piscataway NJ(2017)","DOI":"10.1109\/CVPR.2017.476"},{"key":"e_1_2_7_19_1","doi-asserted-by":"crossref","unstructured":"Zheng H. et\u00a0al.:Learning multi\u2010attention convolutional neural network for fine\u2010grained image recognition. In:Proceedings of the IEEE International Conference on Computer Vision pp.5209\u20135217.IEEE Piscataway NJ(2017)","DOI":"10.1109\/ICCV.2017.557"},{"key":"e_1_2_7_20_1","doi-asserted-by":"crossref","unstructured":"Ronneberger O. Fischer P. Brox T.:U\u2010net: Convolutional networks for biomedical image segmentation. In:International Conference on Medical Image Computing and Computer\u2010Assisted Intervention pp.234\u2013241.Springer Berlin(2015)","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"e_1_2_7_21_1","doi-asserted-by":"crossref","unstructured":"Long J. Shelhamer E. Darrell T.:Fully convolutional networks for semantic segmentation. In:Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition pp.3431\u20133440.IEEE Piscataway NJ(2015)","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"e_1_2_7_22_1","unstructured":"Ren S. et\u00a0al.:Faster r\u2010cnn: Towards real\u2010time object detection with region proposal networks. In:Advances in Neural Information Processing Systems pp.91\u201399.MIT Press Cambridge MA(2015)"},{"key":"e_1_2_7_23_1","doi-asserted-by":"crossref","unstructured":"Redmon J. et\u00a0al.:You only look once: Unified real\u2010time object detection. In:Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition pp.779\u2013788.IEEE Piscataway NJ(2016)","DOI":"10.1109\/CVPR.2016.91"},{"key":"e_1_2_7_24_1","doi-asserted-by":"crossref","unstructured":"Li W. et\u00a0al.:Deepreid: Deep filter pairing neural network for person re\u2010identification. In:Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition pp.152\u2013159.IEEE Piscataway NJ(2014)","DOI":"10.1109\/CVPR.2014.27"},{"key":"e_1_2_7_25_1","doi-asserted-by":"crossref","unstructured":"Ahmed E. Jones M. Marks T.K.:An improved deep learning architecture for person re\u2010identification. In:Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition pp.3908\u20133916.IEEE Piscataway NJ(2015)","DOI":"10.1109\/CVPR.2015.7299016"},{"key":"e_1_2_7_26_1","doi-asserted-by":"crossref","unstructured":"Liu J. et\u00a0al.:Decidenet: Counting varying density crowds through attention guided detection and density estimation. In:Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition pp.5197\u20135206.IEEE Piscataway NJ(2018)","DOI":"10.1109\/CVPR.2018.00545"},{"key":"e_1_2_7_27_1","unstructured":"Kang D. Chan A.:Crowd counting by adaptively fusing predictions from an image pyramid.arXiv:180506115(2018)"},{"key":"e_1_2_7_28_1","doi-asserted-by":"crossref","unstructured":"Lin G. et\u00a0al.:Refinenet: Multi\u2010path refinement networks for high\u2010resolution semantic segmentation. In:Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition pp.1925\u20131934.IEEE Piscataway NJ(2017)","DOI":"10.1109\/CVPR.2017.549"},{"key":"e_1_2_7_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2017.2699184"},{"key":"e_1_2_7_30_1","doi-asserted-by":"crossref","unstructured":"Deb D. Ventura J.:An aggregated multicolumn dilated convolution network for perspective\u2010free counting. In:Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops pp.195\u2013204.IEEE Piscataway NJ(2018)","DOI":"10.1109\/CVPRW.2018.00057"},{"key":"e_1_2_7_31_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33018868"},{"key":"e_1_2_7_32_1","doi-asserted-by":"crossref","unstructured":"Liu X. van deWeijer J. Bagdanov A.D.:Leveraging unlabeled data for crowd counting by learning to rank. In:Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition pp.7661\u20137669.IEEE Piscataway NJ(2018)","DOI":"10.1109\/CVPR.2018.00799"},{"key":"e_1_2_7_33_1","doi-asserted-by":"crossref","unstructured":"Walach E. Wolf L.:Learning to count with cnn boosting. In:European conference on computer vision pp.660\u2013676.Springer Berlin(2016)","DOI":"10.1007\/978-3-319-46475-6_41"},{"key":"e_1_2_7_34_1","doi-asserted-by":"crossref","unstructured":"Ranjan V. Le H. Hoai M.:Iterative crowd counting. In:Proceedings of the European Conference on Computer Vision pp.270\u2013285.Springer Berlin(2018)","DOI":"10.1007\/978-3-030-01234-2_17"},{"key":"e_1_2_7_35_1","doi-asserted-by":"crossref","unstructured":"Liu L. et\u00a0al.:Crowd counting using deep recurrent spatial\u2010aware network. arXiv:180700601 (2018)","DOI":"10.24963\/ijcai.2018\/118"},{"key":"e_1_2_7_36_1","first-page":"2017","volume-title":"Advances in neural information processing systems","author":"Jaderberg M.","year":"2015"},{"key":"e_1_2_7_37_1","doi-asserted-by":"crossref","unstructured":"Sindagi V.A. Patel V.M.:Cnn\u2010based cascaded multi\u2010task learning of high\u2010level prior and density estimation for crowd counting. In:2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS) pp.1\u20136.IEEE Piscataway NJ(2017)","DOI":"10.1109\/AVSS.2017.8078491"},{"key":"e_1_2_7_38_1","doi-asserted-by":"crossref","unstructured":"Shi Z. et\u00a0al.:Crowd counting with deep negative correlation learning. In:Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition pp.5382\u20135390.IEEE Piscataway NJ(2018)","DOI":"10.1109\/CVPR.2018.00564"},{"key":"e_1_2_7_39_1","doi-asserted-by":"crossref","unstructured":"Babu\u2010Sam D. et\u00a0al.:Divide and grow: Capturing huge diversity in crowd images with incrementally growing cnn. In:Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition pp.3618\u20133626.IEEE Piscataway NJ(2018)","DOI":"10.1109\/CVPR.2018.00381"},{"key":"e_1_2_7_40_1","doi-asserted-by":"crossref","unstructured":"Zagoruyko S. Komodakis N.:Learning to compare image patches via convolutional neural networks. In:Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition pp.4353\u20134361.IEEE Piscataway NJ(2015)","DOI":"10.1109\/CVPR.2015.7299064"},{"key":"e_1_2_7_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2019.2943010"},{"key":"e_1_2_7_42_1","doi-asserted-by":"crossref","unstructured":"Shen Z. et\u00a0al.:Crowd counting via adversarial cross\u2010scale consistency pursuit. In:Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition pp.5245\u20135254.IEEE Piscataway NJ(2018)","DOI":"10.1109\/CVPR.2018.00550"},{"key":"e_1_2_7_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2019.2899939"},{"key":"e_1_2_7_44_1","unstructured":"Zhang C. et\u00a0al.:Cross\u2010scene crowd counting via deep convolutional neural networks. In:Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition pp.833\u2013841.IEEE Piscataway NJ(2015)"},{"key":"e_1_2_7_45_1","doi-asserted-by":"crossref","unstructured":"Chan A.B. Liang Z.S.J. Vasconcelos N.:Privacy preserving crowd monitoring: Counting people without people models or tracking. In:2008 IEEE Conference on Computer Vision and Pattern Recognition pp.1\u20137.IEEE Piscataway NJ(2008)","DOI":"10.1109\/CVPR.2008.4587569"}],"container-title":["IET Image Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1049\/ipr2.12187","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/full-xml\/10.1049\/ipr2.12187","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/ietresearch.onlinelibrary.wiley.com\/doi\/pdf\/10.1049\/ipr2.12187","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,28]],"date-time":"2025-10-28T06:32:07Z","timestamp":1761633127000},"score":1,"resource":{"primary":{"URL":"https:\/\/ietresearch.onlinelibrary.wiley.com\/doi\/10.1049\/ipr2.12187"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,4]]},"references-count":44,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2021,8]]}},"alternative-id":["10.1049\/ipr2.12187"],"URL":"https:\/\/doi.org\/10.1049\/ipr2.12187","archive":["Portico"],"relation":{},"ISSN":["1751-9659","1751-9667"],"issn-type":[{"type":"print","value":"1751-9659"},{"type":"electronic","value":"1751-9667"}],"subject":[],"published":{"date-parts":[[2021,4]]},"assertion":[{"value":"2019-12-05","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-03-02","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-04-01","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}