{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T20:47:31Z","timestamp":1775076451162,"version":"3.50.1"},"reference-count":29,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2023,12,31]],"date-time":"2023-12-31T00:00:00Z","timestamp":1703980800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["J. Imaging"],"abstract":"<jats:p>Despite significant strides in achieving vehicle autonomy, robust perception under low-light conditions still remains a persistent challenge. In this study, we investigate the potential of multispectral imaging, thereby leveraging deep learning models to enhance object detection performance in the context of nighttime driving. Features encoded from the red, green, and blue (RGB) visual spectrum and thermal infrared images are combined to implement a multispectral object detection model. This has proven to be more effective compared to using visual channels only, as thermal images provide complementary information when discriminating objects in low-illumination conditions. Additionally, there is a lack of studies on effectively fusing these two modalities for optimal object detection performance. In this work, we present a framework based on the Faster R-CNN architecture with a feature pyramid network. Moreover, we design various fusion approaches using concatenation and addition operators at varying stages of the network to analyze their impact on object detection performance. Our experimental results on the KAIST and FLIR datasets show that our framework outperforms the baseline experiments of the unimodal input source and the existing multispectral object detectors.<\/jats:p>","DOI":"10.3390\/jimaging10010012","type":"journal-article","created":{"date-parts":[[2023,12,31]],"date-time":"2023-12-31T04:51:51Z","timestamp":1703998311000},"page":"12","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["Multispectral Deep Neural Network Fusion Method for Low-Light Object Detection"],"prefix":"10.3390","volume":"10","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1313-182X","authenticated-orcid":false,"given":"Keval","family":"Thaker","sequence":"first","affiliation":[{"name":"Electrical and Computer Engineering, University of Michigan-Dearborn, Dearborn, MI 48128, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3382-540X","authenticated-orcid":false,"given":"Sumanth","family":"Chennupati","sequence":"additional","affiliation":[{"name":"Electrical and Computer Engineering, University of Michigan-Dearborn, Dearborn, MI 48128, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9118-9317","authenticated-orcid":false,"given":"Nathir","family":"Rawashdeh","sequence":"additional","affiliation":[{"name":"Department of Applied Computing, Michigan Technological University, Houghton, MI 49931, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3473-6978","authenticated-orcid":false,"given":"Samir A.","family":"Rawashdeh","sequence":"additional","affiliation":[{"name":"Electrical and Computer Engineering, University of Michigan-Dearborn, Dearborn, MI 48128, USA"}]}],"member":"1968","published-online":{"date-parts":[[2023,12,31]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"031202","DOI":"10.1117\/1.OE.62.3.031202","article-title":"Camera\u2013lidar sensor fusion for drivable area detection in winter weather using convolutional neural networks","volume":"62","author":"Rawashdeh","year":"2022","journal-title":"Opt. Eng."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Abu-Shaqra, A., Abu-Alrub, N., and Rawashdeh, N.A. (2022, January 3\u20137). Object detection in degraded lidar signals by synthetic snowfall noise for autonomous driving. Proceedings of the Autonomous Systems: Sensors, Processing and Security for Ground, Air, Sea and Space Vehicles and Infrastructure 2022, Orlando, FL, USA.","DOI":"10.1117\/12.2617569"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"3051","DOI":"10.4271\/2020-01-0104","article-title":"Effect of adherent rain on vision-based object detection algorithms","volume":"2","author":"Hamzeh","year":"2020","journal-title":"SAE Int. J. Adv. Curr. Pract. Mobil."},{"key":"ref_4","unstructured":"(2021, July 10). Pedestrian Traffic Fatalities by State: 2020 Preliminary Data. Available online: https:\/\/www.ghsa.org\/Pedestrians21."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Zhai, B., Wang, G., and Lin, J. (2023). Pedestrian Detection Method Based on Two-Stage Fusion of Visible Light Image and Thermal Infrared Image. Electronics, 12.","DOI":"10.3390\/electronics12143171"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Hwang, S., Park, J., Kim, N., Choi, Y., and Kweon, I.S. (2015, January 7\u201312). Multispectral pedestrian detection: Benchmark dataset and Baseline. Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298706"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"100","DOI":"10.1016\/j.inffus.2016.05.004","article-title":"Pixel-level image fusion: A survey of the state of the art","volume":"33","author":"Li","year":"2017","journal-title":"Inf. Fusion"},{"key":"ref_8","unstructured":"Choi, E.-J., and Park, D.-J. (2010, January 19\u201320). Human detection using image fusion of thermal and visible image with new joint bilateral filter. Proceedings of the 5th International Conference on Computer Sciences and Convergence Information Technology, Seoul, Republic of Korea."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1532","DOI":"10.1109\/TPAMI.2014.2300479","article-title":"Fast feature pyramids for object detection","volume":"36","author":"Dollar","year":"2014","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_10","unstructured":"Torresan, H., Turgeon, B., Ibarra-Castanedo, C., Hebert, P., and Maldague, X.P. (2004). Thermosense XXVI, SPIE."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1137","DOI":"10.1109\/TPAMI.2016.2577031","article-title":"Faster R-CNN: Towards real-time object detection with region proposal networks","volume":"39","author":"Ren","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You only look once: Unified, real-time object detection. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Lin, T.-Y., Dollar, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature Pyramid Networks for Object Detection. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Hu, J., Shen, L., and Sun, G. (2018, January 18\u201323). Squeeze-and-excitation networks. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00745"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"1096","DOI":"10.1109\/TCSVT.2008.928217","article-title":"Person surveillance using visual and infrared imagery","volume":"18","author":"Krotosky","year":"2008","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Davis, J.W., and Keck, M.A. (2005, January 5\u20137). A two-stage template approach to person detection in thermal imagery. Proceedings of the 2005 Seventh IEEE Workshops on Applications of Computer Vision (WACV\/MOTION\u201905), Breckenridge, CO, USA.","DOI":"10.1109\/ACVMOT.2005.14"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Teutsch, M., Mueller, T., Huber, M., and Beyerer, J. (2014, January 23\u201328). Low resolution person detection with a moving thermal infrared camera by Hot Spot Classification. Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops, Columbus, OH, USA.","DOI":"10.1109\/CVPRW.2014.40"},{"key":"ref_18","unstructured":"Wagner, J., Fischer, V., Herman, M., and Behnke, S. (2016, January 27\u201329). Multi-spectral Pedestrian Detection using Deep Fusion Convolutional Neural Networks. Proceedings of the European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), Bruges, Belgium."},{"key":"ref_19","unstructured":"Choi, H., Kim, S., Park, K., and Sohn, K. (2016, January 4\u20138). Multi-spectral pedestrian detection based on accumulated object proposal with fully convolutional networks. Proceedings of the 2016 23rd International Conference on Pattern Recognition (ICPR), Cancun, Mexico."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1016\/j.patcog.2018.08.005","article-title":"Illumination-aware faster R-CNN for robust multispectral pedestrian detection","volume":"85","author":"Li","year":"2019","journal-title":"Pattern Recognit."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Xu, D., Ouyang, W., Ricci, E., Wang, X., and Sebe, N. (2017, January 21\u201326). Learning cross-modal deep representations for robust pedestrian detection. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.451"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Devaguptapu, C., Akolekar, N., Sharma, M.M., and Balasubramanian, V.N. (2019, January 16\u201317). Borrow from anywhere: Pseudo Multi-Modal Object Detection in thermal imagery. Proceedings of the 2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Long Beach, CA, USA.","DOI":"10.1109\/CVPRW.2019.00135"},{"key":"ref_23","unstructured":"Samir, R., Rashed, A., Yogamani, H., and Dahyot, S.R. (2020, January 31). CNN based Color and Thermal Image Fusion for Object Detection in Automated Driving. Proceedings of the Irish Machine Vision and Image Processing (IMVIP 2020), Sligo, Ireland."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Liu, S.W.J., Zhang, S., and Metaxas, D. (2016, January 19\u201322). Multispectral deep neural networks for pedestrian detection. Proceedings of the British Machine Vision Conference, York, UK.","DOI":"10.5244\/C.30.73"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Konig, D., Adam, M., Jarvers, C., Layher, G., Neumann, H., and Teutsch, M. (2017, January 21\u201326). Fully convolutional region proposal networks for Multispectral Person Detection. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA.","DOI":"10.1109\/CVPRW.2017.36"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Muhammad, M.B., and Yeasin, M. (2020, January 19\u201324). Eigen-cam: Class activation map using principal components. Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK.","DOI":"10.1109\/IJCNN48605.2020.9206626"},{"key":"ref_27","unstructured":"Hendrycks, D., and Dietterich, T. (2019, January 6\u20139). Benchmarking neural network robustness to common corruptions and perturbations. Proceedings of the International Conference on Learning Representations (ICLR), New Orleans, LA, USA."},{"key":"ref_28","unstructured":"Michaelis, C., Mitzkus, B., Geirhos, R., Rusak, E., Bringmann, O., Ecker, A.S., Bethge, M., and Brendel, W. (2019, January 8\u201314). Benchmarking Robustness in Object Detection: Autonomous Driving when Winter is Coming. Proceedings of the Machine Learning for Autonomous Driving Workshop at the 33rd Conference on Neural Information Processing Systems (NeurIPS), Vancouver, BC, Canada."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"148","DOI":"10.1016\/j.inffus.2018.11.017","article-title":"Fusion of multispectral data through illumination-aware deep neural networks for pedestrian detection","volume":"50","author":"Guan","year":"2019","journal-title":"Inf. Fusion"}],"container-title":["Journal of Imaging"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2313-433X\/10\/1\/12\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T21:45:06Z","timestamp":1760132706000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2313-433X\/10\/1\/12"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,12,31]]},"references-count":29,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,1]]}},"alternative-id":["jimaging10010012"],"URL":"https:\/\/doi.org\/10.3390\/jimaging10010012","relation":{},"ISSN":["2313-433X"],"issn-type":[{"value":"2313-433X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,12,31]]}}}