{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,16]],"date-time":"2026-03-16T20:38:21Z","timestamp":1773693501095,"version":"3.50.1"},"reference-count":23,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2020,5,2]],"date-time":"2020-05-02T00:00:00Z","timestamp":1588377600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100005790","name":"Thammasat University","doi-asserted-by":"publisher","award":["2\/23\/2559"],"award-info":[{"award-number":["2\/23\/2559"]}],"id":[{"id":"10.13039\/501100005790","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["J. Imaging"],"abstract":"<jats:p>Crowd counting is a challenging task dealing with the variation of an object scale and a crowd density. Existing works have emphasized on skip connections by integrating shallower layers with deeper layers, where each layer extracts features in a different object scale and crowd density. However, only high-level features are emphasized while ignoring low-level features. This paper proposes an estimation network by passing high-level features to shallow layers and emphasizing its low-level feature. Since an estimation network is a hierarchical network, a high-level feature is also emphasized by an improved low-level feature. Our estimation network consists of two identical networks for extracting a high-level feature and estimating the final result. To preserve semantic information, dilated convolution is employed without resizing the feature map. Our method was tested in three datasets for counting humans and vehicles in a crowd image. The counting performance is evaluated by mean absolute error and root mean squared error indicating the accuracy and robustness of an estimation network, respectively. The experimental result shows that our network outperforms other related works in a high crowd density and is effective for reducing over-counting error in the overall case.<\/jats:p>","DOI":"10.3390\/jimaging6050028","type":"journal-article","created":{"date-parts":[[2020,5,4]],"date-time":"2020-05-04T03:29:39Z","timestamp":1588562979000},"page":"28","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["Redesigned Skip-Network for Crowd Counting with Dilated Convolution and Backward Connection"],"prefix":"10.3390","volume":"6","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8301-6053","authenticated-orcid":false,"given":"Sorn","family":"Sooksatra","sequence":"first","affiliation":[{"name":"School of Information and Communication Technology, Sirindhorn International Institute of Technology, Thammasat University, Pathum Thani 12120, Thailand"},{"name":"School of Information Science, Japan Advanced Institute of Science and Technology, Ishikawa 923-1211, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Toshiaki","family":"Kondo","sequence":"additional","affiliation":[{"name":"School of Information and Communication Technology, Sirindhorn International Institute of Technology, Thammasat University, Pathum Thani 12120, Thailand"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Pished","family":"Bunnun","sequence":"additional","affiliation":[{"name":"National Electronic and Computer Technology Center, National Science and Technology Development Agency, Pathum Thani 12120, Thailand"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Atsuo","family":"Yoshitaka","sequence":"additional","affiliation":[{"name":"School of Information Science, Japan Advanced Institute of Science and Technology, Ishikawa 923-1211, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2020,5,2]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Onoro-Rubio, D., and L\u00f3pez-Sastre, R.J. (2016, January 11\u201314). Towards perspective-free object counting with deep learning. Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46478-7_38"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Sam, D.B., Surya, S., and Babu, R.V. (2017, January 21\u201326). Switching convolutional neural network for crowd counting. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.429"},{"key":"ref_3","unstructured":"Chen, L.C., Papandreou, G., Schroff, F., and Adam, H. (2017). Rethinking atrous convolution for semantic image segmentation. arXiv."},{"key":"ref_4","unstructured":"Zhang, Y., Zhou, D., Chen, S., Gao, S., and Ma, Y. (July, January 26). Single-image crowd counting via multi-column convolutional neural network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Surya, S., and Babu, R.V. (2016, January 18\u201322). TraCount: A deep convolutional neural network for highly overlapping vehicle counting. Proceedings of the Tenth Indian Conference on Computer Vision, Graphics and Image Processing, Guwahati, India.","DOI":"10.1145\/3009977.3010060"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Boominathan, L., Kruthiventi, S.S., and Babu, R.V. (2016, January 15\u201319). Crowdnet: A deep convolutional network for dense crowd counting. Proceedings of the 24th ACM International Conference on Multimedia, Amsterdam, The Netherlands.","DOI":"10.1145\/2964284.2967300"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Kumagai, S., Hotta, K., and Kurita, T. (2017). Mixture of counting cnns: Adaptive integration of cnns specialized to specific appearance for crowd counting. arXiv.","DOI":"10.1007\/s00138-018-0955-6"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Eigen, D., and Fergus, R. (2015, January 7\u201313). Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.304"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1016\/j.patrec.2017.07.007","article-title":"A survey of recent advances in cnn-based single image crowd counting and density estimation","volume":"107","author":"Sindagi","year":"2018","journal-title":"Pattern Recognit. Lett."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"3360","DOI":"10.1007\/s10489-018-1150-1","article-title":"Skip-connection convolutional neural network for still image crowd counting","volume":"48","author":"Wang","year":"2018","journal-title":"Appl. Intell."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Marsden, M., McGuinness, K., Little, S., and O\u2019Connor, N.E. (2016). Fully convolutional crowd counting on highly congested scenes. arXiv.","DOI":"10.5220\/0006097300270033"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Sooksatra, S., Yoshitaka, A., Kondo, T., and Bunnun, P. (2019, January 26\u201329). The Density-Aware Estimation Network for Vehicle Counting in Traffic Surveillance System. Proceedings of the 15th International Conference on Signal-Image Technology & Internet-Based Systems, SITIS 2019, Sorrento-Naples, Italy.","DOI":"10.1109\/SITIS.2019.00047"},{"key":"ref_13","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Li, Y., Zhang, X., and Chen, D. (2018, January 18\u201322). Csrnet: Dilated convolutional neural networks for understanding the highly congested scenes. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00120"},{"key":"ref_15","unstructured":"Gao, G., Gao, J., Liu, Q., Wang, Q., and Wang, Y. (2020). CNN-based Density Estimation and Crowd Counting: A Survey. arXiv."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Idrees, H., Saleemi, I., Seibert, C., and Shah, M. (2013, January 23\u201328). Multi-source multi-scale counting in extremely dense crowd images. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA.","DOI":"10.1109\/CVPR.2013.329"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Guerrero-G\u00f3mez-Olmedo, R., Torre-Jim\u00e9nez, B., L\u00f3pez-Sastre, R., Maldonado-Basc\u00f3n, S., and Onoro-Rubio, D. (2015, January 17\u201319). Extremely overlapping vehicle counting. Proceedings of the Iberian Conference on Pattern Recognition and Image Analysis (IbPRIA), Santiago de Compostela, Spain.","DOI":"10.1007\/978-3-319-19390-8_48"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Sindagi, V.A., and Patel, V.M. (2017, January 22\u201329). Generating high-quality crowd density maps using contextual pyramid cnns. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.206"},{"key":"ref_19","unstructured":"Zou, Z., Liu, Y., Xu, S., Wei, W., Wen, S., and Zhou, P. (2020). Crowd Counting via Hierarchical Scale Recalibration Network. arXiv."},{"key":"ref_20","unstructured":"Huang, S., Li, X., Cheng, Z.Q., Zhang, Z., and Hauptmann, A. (2018). Stacked Pooling for Boosting Scale Invariance of Crowd Counting. arXiv."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Shi, X., Li, X., Wu, C., Kong, S., Yang, J., and He, L. (2020). A Real-Time Deep Network for Crowd Counting. arXiv.","DOI":"10.1109\/ICASSP40776.2020.9053780"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Rodriguez, M., Laptev, I., Sivic, J., and Audibert, J.Y. (2011, January 6\u201313). Density-aware person detection and tracking in crowds. Proceedings of the 2011 International Conference on Computer Vision, Barcelona, Spain.","DOI":"10.1109\/ICCV.2011.6126526"},{"key":"ref_23","unstructured":"Lempitsky, V., and Zisserman, A. (2010, January 6\u20139). Learning to count objects in images. Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada."}],"container-title":["Journal of Imaging"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2313-433X\/6\/5\/28\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,13]],"date-time":"2025-10-13T13:52:24Z","timestamp":1760363544000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2313-433X\/6\/5\/28"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,5,2]]},"references-count":23,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2020,5]]}},"alternative-id":["jimaging6050028"],"URL":"https:\/\/doi.org\/10.3390\/jimaging6050028","relation":{},"ISSN":["2313-433X"],"issn-type":[{"value":"2313-433X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,5,2]]}}}