{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,22]],"date-time":"2026-03-22T02:07:27Z","timestamp":1774145247347,"version":"3.50.1"},"reference-count":39,"publisher":"MDPI AG","issue":"17","license":[{"start":{"date-parts":[[2021,8,30]],"date-time":"2021-08-30T00:00:00Z","timestamp":1630281600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>In the field of computer vision, object detection consists of automatically finding objects in images by giving their positions. The most common fields of application are safety systems (pedestrian detection, identification of behavior) and control systems. Another important application is head\/person detection, which is the primary material for road safety, rescue, surveillance, etc. In this study, we developed a new approach based on two parallel DeepLabv3+ networks to improve the performance of the person detection system. For the implementation of our semantic segmentation model, a working methodology with two types of ground truths extracted from the bounding boxes given by the original ground truths was established. The approach has been implemented on our two private datasets as well as on a public dataset. To show the performance of the proposed system, a comparative analysis was carried out against two state-of-the-art deep learning semantic segmentation models: SegNet and U-Net. With a global accuracy of 99.14%, the results demonstrated that the developed strategy could be an efficient way to build a deep neural network model for semantic segmentation. 
This strategy can be used not only for the detection of the human head but also in several other semantic segmentation applications.<\/jats:p>","DOI":"10.3390\/s21175848","type":"journal-article","created":{"date-parts":[[2021,8,31]],"date-time":"2021-08-31T22:58:15Z","timestamp":1630450695000},"page":"5848","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":15,"title":["New End-to-End Strategy Based on DeepLabv3+ Semantic Segmentation for Human Head Detection"],"prefix":"10.3390","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5529-8834","authenticated-orcid":false,"given":"Mohamed","family":"Chouai","sequence":"first","affiliation":[{"name":"Faculty of Electrical Engineering and Informatics, University of Pardubice, 532 10 Pardubice, Czech Republic"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7359-0764","authenticated-orcid":false,"given":"Petr","family":"Dolezel","sequence":"additional","affiliation":[{"name":"Faculty of Electrical Engineering and Informatics, University of Pardubice, 532 10 Pardubice, Czech Republic"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2324-162X","authenticated-orcid":false,"given":"Dominik","family":"Stursa","sequence":"additional","affiliation":[{"name":"Faculty of Electrical Engineering and Informatics, University of Pardubice, 532 10 Pardubice, Czech Republic"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6640-8995","authenticated-orcid":false,"given":"Zdenek","family":"Nemec","sequence":"additional","affiliation":[{"name":"Faculty of Electrical Engineering and Informatics, University of Pardubice, 532 10 Pardubice, Czech Republic"}]}],"member":"1968","published-online":{"date-parts":[[2021,8,30]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"111102","DOI":"10.1115\/1.4044256","article-title":"Learning to design from humans: Imitating human designers through deep 
learning","volume":"141","author":"Raina","year":"2019","journal-title":"J. Mech. Des."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Gong, V., Daamen, W., Bozzon, A., and Hoogendoorn, S. (2021). Counting people in the crowd using social media images for crowd management in city events. Transportation.","DOI":"10.1007\/s11116-020-10159-z"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Songchenchen, G., and Bourennane, E.B. (2018, January 16\u201318). Implementation of real time reconfigurable embedded architecture for people counting in a crowd area. Proceedings of the International Symposium on Modelling and Implementation of Complex Systems, Laghouat, Algeria.","DOI":"10.1007\/978-3-030-05481-6_17"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Bansal, A., and Venkatesh, K. (2015). People counting in high density crowds from still images. arXiv.","DOI":"10.17706\/IJCEE.2015.7.5.316-324"},{"key":"ref_5","unstructured":"Khan, S.D., Vizzari, G., Bandini, S., and Basalamah, S. (2014). Detecting Dominant Motion Flows and People Counting in High Density Crowds, University of West Bohemia in Pilsen."},{"key":"ref_6","unstructured":"Zou, Z., Shi, Z., Guo, Y., and Ye, J. (2019). Object detection in 20 years: A survey. arXiv."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Pilarczyk, R., and Skarbek, W. (2019). On intra-class variance for deep learning of classifiers. arXiv.","DOI":"10.2478\/fcds-2019-0015"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Chen, L.C., Zhu, Y., Papandreou, G., Schroff, F., and Adam, H. (2018, January 8\u201314). Encoder-decoder with atrous separable convolution for semantic image segmentation. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_49"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Velastin, S.A., Fern\u00e1ndez, R., Espinosa, J.E., and Bay, A. (2020). 
Detecting, tracking and counting people getting on\/off a metropolitan train using a standard video camera. Sensors, 20.","DOI":"10.3390\/s20216251"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"71","DOI":"10.1007\/s12198-020-00211-5","article-title":"CH-Net: Deep adversarial autoencoders for semantic segmentation in X-ray images of cabin baggage screening at airports","volume":"13","author":"Chouai","year":"2020","journal-title":"J. Transp. Secur."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Sun, W., Gao, Z., Cui, J., Ramesh, B., Zhang, B., and Li, Z. (2021). Semantic Segmentation Leveraging Simultaneous Depth Estimation. Sensors, 21.","DOI":"10.3390\/s21030690"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"104169","DOI":"10.1016\/j.imavis.2021.104169","article-title":"Lightweight boundary refinement module based on point supervision for semantic segmentation","volume":"110","author":"Dong","year":"2021","journal-title":"Image Vis. Comput."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"104147","DOI":"10.1016\/j.imavis.2021.104147","article-title":"DeepSegment: Segmentation of motion capture data using deep convolutional neural network","volume":"109","author":"Yasin","year":"2021","journal-title":"Image Vis. Comput."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Weksler, S., Rozenstein, O., Haish, N., Moshelion, M., Wallach, R., and Ben-Dor, E. (2021). Detection of Potassium Deficiency and Momentary Transpiration Rate Estimation at Early Growth Stages Using Proximal Hyperspectral Imaging and Extreme Gradient Boosting. Sensors, 21.","DOI":"10.3390\/s21030958"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Guti\u00e9rrez, J., Rodr\u00edguez, V., and Martin, S. (2021). Comprehensive review of vision-based fall detection systems. 
Sensors, 21.","DOI":"10.3390\/s21030947"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Ko, K., Jang, I., Choi, J.H., Lim, J.H., and Lee, D.U. (2021). Stochastic Decision Fusion of Convolutional Neural Networks for Tomato Ripeness Detection in Agricultural Sorting Systems. Sensors, 21.","DOI":"10.3390\/s21030917"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Xie, J., Stensrud, E., and Skramstad, T. (2021). Detection-Based Object Tracking Applied to Remote Ship Inspection. Sensors, 21.","DOI":"10.3390\/s21030761"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Gong, X., Le, Z., Wu, Y., and Wang, H. (2021). Real-Time Multiobject Tracking Based on Multiway Concurrency. Sensors, 21.","DOI":"10.3390\/s21030685"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Ben Nasr, M.C., Ben Jebara, S., Otis, S., Abdulrazak, B., and Mezghani, N. (2021). A Spectral-Based Approach for BCG Signal Content Classification. Sensors, 21.","DOI":"10.3390\/s21031020"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Herzog, N.J., and Magoulas, G.D. (2021). Brain asymmetry detection and machine learning classification for diagnosis of early Dementia. Sensors, 21.","DOI":"10.3390\/s21030778"},{"key":"ref_21","unstructured":"Granger, E., Kiran, M., and Blais-Morin, L.A. (December, January 28). A comparison of cnn-based face and head detectors for real-time video surveillance applications. Proceedings of the 2017 Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA), Montreal, QC, Canada."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"El Ahmar, W.A., Nowruzi, F.E., and Laganiere, R. (2020, January 14\u201319). Fast Human Head and Shoulder Detection Using Convolutional Networks and RGBD Data. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops, Seattle, WA, USA.","DOI":"10.1109\/CVPRW50498.2020.00061"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Vu, T.H., Osokin, A., and Laptev, I. (2015, January 7\u201313). Context-aware CNNs for person head detection. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.331"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Saqib, M., Khan, S.D., Sharma, N., and Blumenstein, M. (2018, January 8\u201313). Person head detection in multiple scales using deep convolutional neural networks. Proceedings of the 2018 International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil.","DOI":"10.1109\/IJCNN.2018.8489367"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Peng, D., Sun, Z., Chen, Z., Cai, Z., Xie, L., and Jin, L. (2018, January 20\u201324). Detecting heads using feature refine net and cascaded multi-scale architecture. Proceedings of the 2018 24th International Conference on Pattern Recognition (ICPR), Beijing, China.","DOI":"10.1109\/ICPR.2018.8545068"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"2127","DOI":"10.1007\/s00371-020-01974-7","article-title":"Scale and density invariant head detection deep model for crowd counting in pedestrian crowds","volume":"37","author":"Khan","year":"2020","journal-title":"Vis. Comput."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Wang, Y., Yin, Y., Wu, W., Sun, S., and Wang, X. (2017, January 5\u20138). Robust person head detection based on multi-scale representation fusion of deep convolution neural network. 
Proceedings of the 2017 IEEE International Conference on Robotics and Biomimetics (ROBIO), Macao, China.","DOI":"10.1109\/ROBIO.2017.8324433"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"98679","DOI":"10.1109\/ACCESS.2020.2995764","article-title":"Robust Head Detection in Complex Videos Using Two-Stage Deep Convolution Framework","volume":"8","author":"Khan","year":"2020","journal-title":"IEEE Access"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Stewart, R., Andriluka, M., and Ng, A.Y. (2016, January 27\u201330). End-to-end people detection in crowded scenes. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.255"},{"key":"ref_30","unstructured":"Vora, A., and Chilaka, V. (2018). FCHD: Fast and accurate head detection in crowded scenes. arXiv."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"8843113","DOI":"10.1155\/2020\/8843113","article-title":"Person detection for an orthogonally placed monocular camera","volume":"2020","author":"Skrabanek","year":"2020","journal-title":"J. Adv. Transp."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Khan, S.D., Ullah, H., Ullah, M., Conci, N., Cheikh, F.A., and Beghdadi, A. (2019, January 18\u201321). Person head detection based deep model for people counting in sports videos. Proceedings of the 2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), Taipei, Taiwan.","DOI":"10.1109\/AVSS.2019.8909898"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Yudin, D., Ivanov, A., and Shchendrygin, M. (2019). Detection of a human head on a low-quality image and its software implementation. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci., 42.","DOI":"10.5194\/isprs-archives-XLII-2-W12-237-2019"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Le, C., Ma, H., Wang, X., and Li, X. (2018, January 7\u201310). 
Key parts context and scene geometry in human head detection. Proceedings of the 2018 25th IEEE International Conference on Image Processing (ICIP), Athens, Greece.","DOI":"10.1109\/ICIP.2018.8451832"},{"key":"ref_35","unstructured":"Chi, C., Zhang, S., Xing, J., Lei, Z., Li, S.Z., and Zou, X. (2020, January 7\u20138). Relational learning for joint head and human detection. Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1016\/j.patcog.2017.04.018","article-title":"The connected-component labeling problem: A review of state-of-the-art algorithms","volume":"70","author":"He","year":"2017","journal-title":"Pattern Recognit."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"2481","DOI":"10.1109\/TPAMI.2016.2644615","article-title":"Segnet: A deep convolutional encoder-decoder architecture for image segmentation","volume":"39","author":"Badrinarayanan","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5\u20139). U-net: Convolutional networks for biomedical image segmentation. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_39","unstructured":"(2021, June 01). Intel RealSense Depth Camera D435. 
Available online: https:\/\/www.intelrealsense.com\/depth-camera-d435\/."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/17\/5848\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T06:55:59Z","timestamp":1760165759000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/17\/5848"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,8,30]]},"references-count":39,"journal-issue":{"issue":"17","published-online":{"date-parts":[[2021,9]]}},"alternative-id":["s21175848"],"URL":"https:\/\/doi.org\/10.3390\/s21175848","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,8,30]]}}}