{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,5]],"date-time":"2026-02-05T11:32:51Z","timestamp":1770291171350,"version":"3.49.0"},"reference-count":38,"publisher":"Springer Science and Business Media LLC","issue":"5","license":[{"start":{"date-parts":[[2024,6,16]],"date-time":"2024-06-16T00:00:00Z","timestamp":1718496000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,6,16]],"date-time":"2024-06-16T00:00:00Z","timestamp":1718496000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"Wuhan Knowledge Innovation Project","award":["2022020801010258"],"award-info":[{"award-number":["2022020801010258"]}]},{"DOI":"10.13039\/501100003819","name":"Natural Science Foundation of Hubei Province","doi-asserted-by":"publisher","award":["2023AFB424"],"award-info":[{"award-number":["2023AFB424"]}],"id":[{"id":"10.13039\/501100003819","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Complex Intell. Syst."],"published-print":{"date-parts":[[2024,10]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>The aftermath of a natural disaster leaves victims trapped in rubble which is challenging to detect by smart drones due to the victims in low visibility under the adverse disaster environments and victims in various sizes. To overcome the above challenges, a transformer fusion-based scale-aware attention network (TFSANet) is proposed to overcome adverse environmental impacts in disaster areas by robustly integrating the latent interactions between RGB and thermal images and to address the problem of various-sized victim detection. 
Firstly, a transformer fusion model is developed to incorporate a two-stream backbone network to effectively fuse the complementary characteristics between RGB and thermal images. This aims to solve the problem that the victims cannot be seen clearly due to the adverse disaster area, such as smog and heavy rain. In addition, a scale-aware attention mechanism is designed to be embedded into the head network to adaptively adjust the size of receptive fields aiming to capture victims with different scales. Extensive experiments on two challenging datasets indicate that our TFSANet achieves superior results. The proposed method achieves 86.56% average precision (AP) on the National Institute of Informatics\u2014Chiba University (NII-CU) multispectral aerial person detection dataset, outperforming the state-of-the-art approach by 4.38%. On the drone-captured RGBT person detection (RGBTDronePerson) dataset, the proposed method significantly improves the AP of the state-of-the-art approach by 4.33%.<\/jats:p>","DOI":"10.1007\/s40747-024-01515-y","type":"journal-article","created":{"date-parts":[[2024,6,16]],"date-time":"2024-06-16T16:01:23Z","timestamp":1718553683000},"page":"6619-6632","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":13,"title":["Transformer fusion-based scale-aware attention network for multispectral victim detection"],"prefix":"10.1007","volume":"10","author":[{"given":"Yunfan","family":"Chen","sequence":"first","affiliation":[]},{"given":"Yuting","family":"Li","sequence":"additional","affiliation":[]},{"given":"Wenqi","family":"Zheng","sequence":"additional","affiliation":[]},{"given":"Xiangkui","family":"Wan","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,6,16]]},"reference":[{"issue":"1","key":"1515_CR1","first-page":"1","volume":"3","author":"RD Arnold","year":"2018","unstructured":"Arnold RD, Yamaguchi H, Tanaka T (2018) Search 
and rescue with autonomous flying robots through behavior-based cooperative intelligence. J Int Hum Act 3(1):1\u201318","journal-title":"J Int Hum Act"},{"key":"1515_CR2","doi-asserted-by":"crossref","unstructured":"Hwang S, Park J, Kim N, et al. (2015) Multispectral pedestrian detection: benchmark dataset and baseline. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1037\u20131045","DOI":"10.1109\/CVPR.2015.7298706"},{"key":"1515_CR3","first-page":"509","volume":"587","author":"J Wagner","year":"2016","unstructured":"Wagner J, Fischer V, Herman M et al (2016) Multispectral pedestrian detection using deep fusion convolutional neural networks. ESANN 587:509\u2013514","journal-title":"ESANN"},{"key":"1515_CR4","doi-asserted-by":"crossref","unstructured":"Liu J, Zhang S, Wang S et al. (2016) Multispectral deep neural networks for pedestrian detection. arXiv preprint https:\/\/arXiv.org\/1611.02644","DOI":"10.5244\/C.30.73"},{"issue":"8","key":"1515_CR5","doi-asserted-by":"publisher","first-page":"1179","DOI":"10.1049\/iet-cvi.2018.5315","volume":"12","author":"Y Chen","year":"2018","unstructured":"Chen Y, Xie H, Shin H (2018) Multi-layer fusion techniques using a CNN for multispectral pedestrian detection. IET Comput Vision 12(8):1179\u20131187","journal-title":"IET Comput Vision"},{"key":"1515_CR6","doi-asserted-by":"publisher","first-page":"161","DOI":"10.1016\/j.patcog.2018.08.005","volume":"85","author":"C Li","year":"2019","unstructured":"Li C, Song D, Tong R et al (2019) Illumination-aware faster R-CNN for robust multispectral pedestrian detection. Pattern Recogn 85:161\u2013171","journal-title":"Pattern Recogn"},{"key":"1515_CR7","doi-asserted-by":"crossref","unstructured":"Zhou K, Chen L, Cao X (2020) Improving multispectral pedestrian detection by addressing modality imbalance problems. In: Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part XVIII 16. 
Springer International Publishing, pp 787\u2013803","DOI":"10.1007\/978-3-030-58523-5_46"},{"issue":"5","key":"1515_CR8","doi-asserted-by":"publisher","first-page":"768","DOI":"10.1364\/JOSAA.386410","volume":"37","author":"Y Chen","year":"2020","unstructured":"Chen Y, Shin H (2020) Multispectral image fusion based pedestrian detection using a multilayer fused deconvolutional single-shot detector. JOSA A 37(5):768\u2013779","journal-title":"JOSA A"},{"key":"1515_CR9","doi-asserted-by":"crossref","unstructured":"Zhang H, Fromont E, Lef\u00e8vre S et al. (2021) Guided attentive feature fusion for multispectral pedestrian detection. In: Proceedings of the IEEE\/CVF winter conference on applications of computer vision, pp 72\u201380","DOI":"10.1109\/WACV48630.2021.00012"},{"key":"1515_CR10","unstructured":"Li C, Song D, Tong R et al. (2018) Multispectral pedestrian detection via simultaneous detection and segmentation. arXiv preprint https:\/\/arXiv.org\/1808.04818"},{"key":"1515_CR11","doi-asserted-by":"publisher","first-page":"70","DOI":"10.1016\/j.isprsjprs.2019.02.005","volume":"150","author":"Y Cao","year":"2019","unstructured":"Cao Y, Guan D, Wu Y et al (2019) Box-level segmentation supervised deep neural networks for accurate and real-time multispectral pedestrian detection. ISPRS J Photogramm Remote Sens 150:70\u201379","journal-title":"ISPRS J Photogramm Remote Sens"},{"key":"1515_CR12","doi-asserted-by":"crossref","unstructured":"Zhang H, Fromont E, Lef\u00e8vre S et al. (2022) Low-cost multispectral scene analysis with modality distillation. 
In: Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision, pp 803\u2013812","DOI":"10.1109\/WACV51458.2022.00339"},{"issue":"3","key":"1515_CR13","doi-asserted-by":"publisher","first-page":"2935","DOI":"10.1007\/s11063-022-10991-7","volume":"55","author":"X Zuo","year":"2023","unstructured":"Zuo X, Wang Z, Liu Y et al (2023) LGADet: light-weight anchor-free multispectral pedestrian detection with mixed local and global attention. Neural Process Lett 55(3):2935\u20132952","journal-title":"Neural Process Lett"},{"key":"1515_CR14","doi-asserted-by":"crossref","unstructured":"Zhang L, Zhu X, Chen X et al. (2019) Weakly aligned cross-modal learning for multispectral pedestrian detection. In: Proceedings of the IEEE\/CVF international conference on computer vision, pp 5127\u20135137","DOI":"10.1109\/ICCV.2019.00523"},{"key":"1515_CR15","doi-asserted-by":"crossref","unstructured":"Wanchaitanawong N, Tanaka M, Shibata T et al. (2021) Multi-modal pedestrian detection with large misalignment based on modal-wise regression and multi-modal IoU. In: 2021 17th International Conference on Machine Vision and Applications (MVA). IEEE, pp 1\u20136","DOI":"10.23919\/MVA51890.2021.9511366"},{"key":"1515_CR16","doi-asserted-by":"publisher","DOI":"10.1016\/j.asoc.2023.110768","volume":"147","author":"W Hu","year":"2023","unstructured":"Hu W, Fu C, Cao R et al (2023) Joint dual-stream interaction and multi-scale feature extraction network for multi-spectral pedestrian detection. Appl Soft Comput 147:110768","journal-title":"Appl Soft Comput"},{"issue":"1","key":"1515_CR17","doi-asserted-by":"publisher","first-page":"90","DOI":"10.3390\/su13010090","volume":"13","author":"M Dehghani","year":"2020","unstructured":"Dehghani M, Ghiasi M, Niknam T et al (2020) Blockchain-based securing of data exchange in a power transmission system considering congestion management and social welfare. 
Sustainability 13(1):90","journal-title":"Sustainability"},{"key":"1515_CR18","doi-asserted-by":"publisher","DOI":"10.1016\/j.epsr.2022.108975","volume":"215","author":"M Ghiasi","year":"2023","unstructured":"Ghiasi M, Niknam T, Wang Z et al (2023) A comprehensive review of cyber-attacks and defense mechanisms for improving security in smart grid energy systems: past, present and future. Electr Power Syst Res 215:108975","journal-title":"Electr Power Syst Res"},{"key":"1515_CR19","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s10614-017-9716-2","volume":"53","author":"P Akbary","year":"2019","unstructured":"Akbary P, Ghiasi M, Pourkheranjani MRR et al (2019) Extracting appropriate nodal marginal prices for all types of committed reserve. Comput Econ 53:1\u201326","journal-title":"Comput Econ"},{"issue":"1","key":"1515_CR20","doi-asserted-by":"publisher","first-page":"44","DOI":"10.1007\/s42452-018-0049-0","volume":"1","author":"M Ghiasi","year":"2019","unstructured":"Ghiasi M, Ghadimi N, Ahmadinia E (2019) An analytical methodology for reliability assessment and failure analysis in distributed power system. SN Appl Sci 1(1):44","journal-title":"SN Appl Sci"},{"issue":"1","key":"1515_CR21","doi-asserted-by":"publisher","first-page":"86","DOI":"10.1049\/stg2.12095","volume":"6","author":"M Ghiasi","year":"2023","unstructured":"Ghiasi M, Wang Z, Mehrandezh M et al (2023) Evolution of smart grids towards the Internet of energy: concept and essential components for deep decarbonisation. IET Smart Grid 6(1):86\u2013102","journal-title":"IET Smart Grid"},{"key":"1515_CR22","unstructured":"Abdel-Basset M et al (2022) Responsible system based on artificial intelligence to reduce greenhouse gas emissions in 6G networks. 
DE202022105964, Applicant\/Assignee: Zagazig University, Publication Number: 202022105964, Publication Date: 29.12.2022, WIPO, https:\/\/patentscope.wipo.int\/search\/en\/detail.jsf?docId=DE383466530&_cid=P12-LPNEZV-30550-1 Accessed on 09 Mar 2023"},{"issue":"6","key":"1515_CR23","doi-asserted-by":"publisher","first-page":"840","DOI":"10.1002\/rob.22082","volume":"39","author":"S Speth","year":"2022","unstructured":"Speth S, Goncalves A, Rigault B et al (2022) Deep learning with RGB and thermal images onboard a drone for monitoring operations. J Field Robot 39(6):840\u2013868","journal-title":"J Field Robot"},{"key":"1515_CR24","doi-asserted-by":"publisher","first-page":"61","DOI":"10.1016\/j.isprsjprs.2023.08.016","volume":"204","author":"Y Zhang","year":"2023","unstructured":"Zhang Y, Xu C, Yang W et al (2023) Drone-based RGBT tiny person detection. ISPRS J Photogramm Remote Sens 204:61\u201376","journal-title":"ISPRS J Photogramm Remote Sens"},{"key":"1515_CR25","unstructured":"Ren S, He K, Girshick R, et al. (2015) Faster R-CNN: towards real-time object detection with region proposal networks. Adv Neural Inform Process Syst 28"},{"issue":"7","key":"1515_CR26","doi-asserted-by":"publisher","first-page":"2244","DOI":"10.3390\/s18072244","volume":"18","author":"DC De Oliveira","year":"2018","unstructured":"De Oliveira DC, Wehrmeister MA (2018) Using deep learning and low-cost RGB and thermal cameras to detect pedestrians in aerial images captured by multirotor UAV. Sensors 18(7):2244","journal-title":"Sensors"},{"key":"1515_CR27","doi-asserted-by":"publisher","DOI":"10.1016\/j.compeleceng.2020.106697","volume":"85","author":"TM Dawdi","year":"2020","unstructured":"Dawdi TM, Abdalla N, Elkalyoubi YM et al (2020) Locating victims in hot environments using combined thermal and optical imaging. 
Comput Electr Eng 85:106697","journal-title":"Comput Electr Eng"},{"issue":"12","key":"1515_CR28","doi-asserted-by":"publisher","first-page":"783","DOI":"10.1038\/s42256-020-00261-3","volume":"2","author":"DC Schedl","year":"2020","unstructured":"Schedl DC, Kurmi I, Bimber O (2020) Search and rescue with airborne optical sectioning. Nat Mach Intell 2(12):783\u2013790","journal-title":"Nat Mach Intell"},{"key":"1515_CR29","unstructured":"Redmon J, Farhadi A (2018) Yolov3: an incremental improvement. arXiv preprint https:\/\/arXiv.org\/1804.02767"},{"issue":"1","key":"1515_CR30","volume":"4","author":"CC Ulloa","year":"2023","unstructured":"Ulloa CC, Garrido L, Del Cerro J et al (2023) Autonomous victim detection system based on deep learning and multispectral imagery. Mach Learn: Sci Technol 4(1):015018","journal-title":"Mach Learn: Sci Technol"},{"key":"1515_CR31","unstructured":"https:\/\/github.com\/ultralytics\/yolov5"},{"key":"1515_CR32","doi-asserted-by":"crossref","unstructured":"Zhang Y, Chen J, Huang D (2022) Cat-det: contrastively augmented transformer for multi-modal 3d object detection. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp 908\u2013917","DOI":"10.1109\/CVPR52688.2022.00098"},{"key":"1515_CR33","unstructured":"Qingyun F, Dapeng H, Zhaokui W (2021) Cross-modality fusion transformer for multispectral object detection. arXiv preprint https:\/\/arXiv.org\/2111.00273"},{"key":"1515_CR34","unstructured":"Hendrycks D, Gimpel K (2016) Gaussian error linear units (gelus). arXiv preprint https:\/\/arXiv.org\/1606.08415"},{"key":"1515_CR35","doi-asserted-by":"crossref","unstructured":"Li X, Wang W, Hu X, et al. (2019) Selective kernel networks. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 510\u2013519","DOI":"10.1109\/CVPR.2019.00060"},{"key":"1515_CR36","doi-asserted-by":"crossref","unstructured":"Rezatofighi H, Tsoi N, Gwak J Y, et al. 
(2019) Generalized intersection over union: a metric and a loss for bounding box regression. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 658\u2013666","DOI":"10.1109\/CVPR.2019.00075"},{"key":"1515_CR37","unstructured":"Bochkovskiy A, Wang CY, Liao HYM (2020) Yolov4: optimal speed and accuracy of object detection. arXiv preprint https:\/\/arXiv.org\/2004.10934"},{"key":"1515_CR38","doi-asserted-by":"crossref","unstructured":"Lin T Y, Maire M, Belongie S et al. (2014) Microsoft coco: common objects in context. In: Computer Vision\u2013ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6\u201312, 2014, Proceedings, Part V 13. Springer International Publishing, pp 740\u2013755","DOI":"10.1007\/978-3-319-10602-1_48"}],"container-title":["Complex & Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-024-01515-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s40747-024-01515-y\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-024-01515-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,14]],"date-time":"2024-09-14T15:13:28Z","timestamp":1726326808000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s40747-024-01515-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,6,16]]},"references-count":38,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2024,10]]}},"alternative-id":["1515"],"URL":"https:\/\/doi.org\/10.1007\/s40747-024-01515-y","relation":{},"ISSN":["2199-4536","2198-6053"],"issn-type":[{"value":"2199-4536","type":"print"},{"value":"2198-6053","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,6,16]]},"assertion":[{"value":"28 September 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"30 May 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 June 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}