{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,1]],"date-time":"2026-01-01T10:06:12Z","timestamp":1767261972644,"version":"build-2065373602"},"reference-count":26,"publisher":"MDPI AG","issue":"8","license":[{"start":{"date-parts":[[2023,4,14]],"date-time":"2023-04-14T00:00:00Z","timestamp":1681430400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"FCT\u2013Funda\u00e7\u00e3o para a Ci\u00eancia e Tecnologia","award":["UIDB\/00319\/2020"],"award-info":[{"award-number":["UIDB\/00319\/2020"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Multi-human detection and tracking in indoor surveillance is a challenging task due to various factors such as occlusions, illumination changes, and complex human-human and human-object interactions. In this study, we address these challenges by exploring the benefits of a low-level sensor fusion approach that combines grayscale and neuromorphic vision sensor (NVS) data. We first generate a custom dataset using an NVS camera in an indoor environment. We then conduct a comprehensive study by experimenting with different image features and deep learning networks, followed by a multi-input fusion strategy to optimize our experiments with respect to overfitting. Our primary goal is to determine the best input feature types for multi-human motion detection using statistical analysis. We find that there is a significant difference between the input features of optimized backbones, with the best strategy depending on the amount of available data. Specifically, under a low-data regime, event-based frames seem to be the preferred input feature type, while higher data availability benefits the combined use of grayscale and optical flow features. 
Our results demonstrate the potential of sensor fusion and deep learning techniques for multi-human tracking in indoor surveillance, although it is acknowledged that further studies are needed to confirm our findings.<\/jats:p>","DOI":"10.3390\/s23083993","type":"journal-article","created":{"date-parts":[[2023,4,14]],"date-time":"2023-04-14T09:23:43Z","timestamp":1681464223000},"page":"3993","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Sensor Fusion Approach for Multiple Human Motion Detection for Indoor Surveillance Use-Case"],"prefix":"10.3390","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5581-1279","authenticated-orcid":false,"given":"Ali","family":"Abbasi","sequence":"first","affiliation":[{"name":"Algorithmic Center, University of Minho, 4800-058 Azur\u00e9m, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5259-1891","authenticated-orcid":false,"given":"Sandro","family":"Queir\u00f3s","sequence":"additional","affiliation":[{"name":"School of Medicine, University of Minho, 4710-057 Gualtar, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8425-3501","authenticated-orcid":false,"given":"Nuno M. 
C.","family":"da Costa","sequence":"additional","affiliation":[{"name":"Algorithmic Center, University of Minho, 4800-058 Azur\u00e9m, Portugal"},{"name":"2Ai-School of Technology, IPCA, 4750-810 Barcelos, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6703-3278","authenticated-orcid":false,"given":"Jaime C.","family":"Fonseca","sequence":"additional","affiliation":[{"name":"Algorithmic Center, University of Minho, 4800-058 Azur\u00e9m, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5880-033X","authenticated-orcid":false,"given":"Jo\u00e3o","family":"Borges","sequence":"additional","affiliation":[{"name":"Algorithmic Center, University of Minho, 4800-058 Azur\u00e9m, Portugal"},{"name":"2Ai-School of Technology, IPCA, 4750-810 Barcelos, Portugal"}]}],"member":"1968","published-online":{"date-parts":[[2023,4,14]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"48","DOI":"10.1186\/s40537-019-0212-5","article-title":"Intelligent video surveillance: A review through deep learning techniques for crowd analysis","volume":"6","author":"Sreenu","year":"2019","journal-title":"J. Big Data"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"865","DOI":"10.1109\/TSMCC.2011.2178594","article-title":"Video-based abnormal human behavior recognition\u2014A review","volume":"42","author":"Popoola","year":"2012","journal-title":"IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.)"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Ibrahim, S., and Warsono, S. (2016). A comprehensive review on intelligent surveillance systems. Commun. Sci. Technol., 1.","DOI":"10.21924\/cst.1.1.2016.7"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"103116","DOI":"10.1016\/j.jvcir.2021.103116","article-title":"A review of video surveillance systems","volume":"77","author":"Elharrouss","year":"2021","journal-title":"J. Vis. Commun. 
Image Represent."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"704504","DOI":"10.1155\/2013\/704504","article-title":"A review of data fusion techniques","volume":"2013","author":"Castanedo","year":"2013","journal-title":"Sci. World J."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1016\/j.jvcir.2018.06.006","article-title":"An effective motion object detection method using optical flow estimation under a moving camera","volume":"55","author":"Zhang","year":"2018","journal-title":"J. Vis. Commun. Image Represent."},{"key":"ref_7","unstructured":"Huang, J., Zou, W., Zhu, J., and Zhu, Z. (2018). Optical flow based real-time moving object detection in unconstrained scenes. arXiv."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"114544","DOI":"10.1016\/j.eswa.2020.114544","article-title":"Optical-flow-based framework to boost video object detection performance with object enhancement","volume":"170","author":"Fan","year":"2021","journal-title":"Expert Syst. Appl."},{"key":"ref_9","first-page":"203","article-title":"Human motion detection and tracking for real-time security system","volume":"5","author":"Shaalini","year":"2013","journal-title":"Int. J. Adv. Res. Comput. Sci. Softw. Eng."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Cho, J., Jung, Y., Kim, D.-S., Lee, S., and Jung, Y. (2019). Moving object detection based on optical flow estimation and a Gaussian mixture model for advanced driver assistance systems. Sensors, 19.","DOI":"10.3390\/s19143217"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Jo, K., Im, J., Kim, J., and Kim, D.S. (2017, January 12\u201314). A real-time multi-class multi-object tracker using YOLOv2. 
Proceedings of the 2017 IEEE International Conference on Signal and Image Processing Applications (ICSIPA), Kuching, Malaysia.","DOI":"10.1109\/ICSIPA.2017.8120665"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"3981","DOI":"10.1007\/s11042-020-09749-x","article-title":"Real time object detection and trackingsystem for video surveillance system","volume":"80","author":"Jha","year":"2021","journal-title":"Multimed. Tools Appl."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014, January 23\u201328). Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.81"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Girshick, R. (2015, January 7\u201313). Fast r-cnn. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile.","DOI":"10.1109\/ICCV.2015.169"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"1137","DOI":"10.1109\/TPAMI.2016.2577031","article-title":"Faster r-cnn: Towards real-time object detection with region proposal networks","volume":"39","author":"Ren","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Doll\u00e1r, P., and Girshick, R. (2017, January 22\u201329). Mask r-cnn. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Zhao, Y., Han, R., and Rao, Y. (2019, January 14\u201315). A new feature pyramid network for object detection. 
Proceedings of the 2019 International Conference on Virtual Reality and Intelligent Systems (ICVRIS), Jishou, China.","DOI":"10.1109\/ICVRIS.2019.00110"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You only look once: Unified, real-time object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_19","unstructured":"Redmon, J., and Farhadi, A. (2018). Yolov3: An incremental improvement. arXiv."},{"key":"ref_20","unstructured":"Bochkovskiy, A., Wang, C.Y., and Liao, H.Y. (2020). Yolov4: Optimal speed and accuracy of object detection. arXiv."},{"key":"ref_21","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., and Berg, A.C. (2016). European Conference on Computer Vision, Springer."},{"key":"ref_22","unstructured":"Fu, C.Y., Liu, W., Ranga, A., Tyagi, A., and Berg, A.C. (2017). Dssd: Deconvolutional single shot detector. arXiv."},{"key":"ref_23","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (July, January 26). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_24","unstructured":"Lin, T.Y., Goyal, P., Girshick, R., He, K., and Dollar, P. (2020, January 13\u201319). Focal Loss for Dense Object Detection. Proceedings of the IEEE International Conference on Computer Vision, Seattle, WA, USA."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Lin, G., Milan, A., Shen, C., and Reid, I. (2017, January 21\u201326). RefineNet: Multi-path refinement networks for high-resolution semantic segmentation. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.549"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"3069","DOI":"10.1007\/s11263-021-01513-4","article-title":"FairMOT: On the Fairness of Detection and Re-identification in Multiple Object Tracking","volume":"129","author":"Zhang","year":"2021","journal-title":"Int. J. Comput. Vis."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/8\/3993\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T19:16:06Z","timestamp":1760123766000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/8\/3993"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,4,14]]},"references-count":26,"journal-issue":{"issue":"8","published-online":{"date-parts":[[2023,4]]}},"alternative-id":["s23083993"],"URL":"https:\/\/doi.org\/10.3390\/s23083993","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2023,4,14]]}}}