{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,17]],"date-time":"2026-05-17T15:07:08Z","timestamp":1779030428161,"version":"3.51.4"},"reference-count":41,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2026,5,12]],"date-time":"2026-05-12T00:00:00Z","timestamp":1778544000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/legalcode"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Cyber-Phys. Syst."],"published-print":{"date-parts":[[2026,7,31]]},"abstract":"<jats:p>The newly proposed multimodal transformer architecture offers a new paradigm for UAV detection and aerial object recognition. It introduces an innovative way of feeding multiple data streams, such as audio, infrared video, RGB video, and radar, into the architecture for processing, using independent modalities. The unique features of each modality are attached and processed together in the architecture, where the features are then exposed to the multimodal transformer for classification. Thus, all complementary information can be pooled within the integration framework so that the model can discriminate drone targets under outdoor conditions from other aerial objects such as birds, helicopters, and airplanes. These methodologies are expected to outperform traditional single-modality systems by improving detection accuracy through class balancing and addressing modality-specific limitations. The proposed model has been further tested through various experiments to evaluate its robustness under conditions such as missing entries, corrupted data, and synthetic inputs. The results suggest that it has strong potential to serve as a benchmark in UAV detection. 
Thus, this work is part of an emerging body of sensor fusion and deep learning-related research, demonstrating the potential of multimodal data in real-world detection problems.<\/jats:p>","DOI":"10.1145\/3807778","type":"journal-article","created":{"date-parts":[[2026,4,9]],"date-time":"2026-04-09T13:47:52Z","timestamp":1775742472000},"page":"1-21","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["A Multimodal Transformer Approach for UAV Detection and Aerial Object Recognition Using Radar, Audio, and Video Data"],"prefix":"10.1145","volume":"10","author":[{"ORCID":"https:\/\/orcid.org\/0009-0008-4963-4625","authenticated-orcid":false,"given":"Mauro","family":"Larrat","sequence":"first","affiliation":[{"name":"Institute of Exact and Natural Sciences, Federal University of Para, Bel\u00e9m, Brazil"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2735-1383","authenticated-orcid":false,"given":"Claudomiro","family":"Sales","sequence":"additional","affiliation":[{"name":"Institute of Exact and Natural Sciences, Federal University of Para, Bel\u00e9m, 
Brazil"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2026,5,12]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"publisher","DOI":"10.1145\/3628797.3628813"},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.1145\/3616394.3618265"},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1145\/3647444.3647856"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1145\/3631588.3631597"},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.1145\/3560905.3568515"},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.1145\/3560905.3568499"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.3390\/app14031075"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1109\/JSEN.2024.3449440"},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.3390\/s24092701"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1145\/3641282"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1145\/3652628.3652813"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1145\/3610419.3610487"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1145\/3583074"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1145\/3677779.3677838"},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1145\/3579999"},{"key":"e_1_3_2_17_2","unstructured":"J. Xiao P. Pisutsin C. W. Tsao and M. Feroskhan. 2024. Clustering-based learning for uav tracking and pose estimation. arXiv:2405.16867. Retrieved from https:\/\/arxiv.org\/abs\/2405.16867"},{"key":"e_1_3_2_18_2","first-page":"16","volume-title":"Building a Mutually Complementary Supply Chain between Japan and the United States: Pathways to Deepening Japan-U.S. Defense Equipment and Technology Cooperation","author":"Johnstone C. B.","year":"2024","unstructured":"C. B. Johnstone, C. R. Cook, A. Aldisert, L. Klaas, C. Michienzi, G. Rubinstein, G. Sanders, and N. 
Szechenyi. (Eds.). 2024. Case study two: Electro-optical sensors. In Building a Mutually Complementary Supply Chain between Japan and the United States: Pathways to Deepening Japan-U.S. Defense Equipment and Technology Cooperation. Center for Strategic and International Studies (CSIS), 16\u201320. Retrieved from http:\/\/www.jstor.org\/stable\/resrep62403.6"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1145\/3597060.3597236"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1145\/3678549"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.3390\/rs14143326"},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.1109\/JSEN.2022.3141213"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.3390\/s23218901"},{"key":"e_1_3_2_24_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/TIM.2022.3188050","article-title":"Diat-radsatnet\u2014A novel lightweight dcnn architecture for micro-doppler-based small unmanned aerial vehicle (suav) targets\u2019 detection and classification","volume":"71","author":"Kumawat H.","year":"2022","unstructured":"H. Kumawat, M. Chakraborty, and A. A. B. Raj. 2022. Diat-radsatnet\u2014A novel lightweight dcnn architecture for micro-doppler-based small unmanned aerial vehicle (suav) targets\u2019 detection and classification. IEEE Transactions on Instrumentation and Measurement 71 (2022), 1\u201311.","journal-title":"IEEE Transactions on Instrumentation and Measurement"},{"key":"e_1_3_2_25_2","unstructured":"S. Kang H. Forsten V. Semkin and S. Rangan. 2024. Millimeter wave 60 ghz radar measurements: Uas and birds. 
IEEE Dataport Technical Report."},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1145\/3638985.3639013"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2021.3061896"},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP39728.2021.9414985"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICITIIT57246.2023.10068637"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1145\/3643833.3656133"},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1145\/3645259.3645265"},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.3390\/signals4020018"},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.3390\/drones7010026"},{"key":"e_1_3_2_34_2","unstructured":"H. Kumawat M. Chakraborty A. A. B. Raj and S. V. Dhavale. 2022. Diat-\u00b5sat: Micro-doppler signature dataset of small unmanned aerial vehicle (suav). IEEE Dataport Technical Report."},{"key":"e_1_3_2_35_2","doi-asserted-by":"crossref","unstructured":"M. Shukor E. Fini V. G. T. da Costa M. Cord J. Susskind and A. El-Nouby. 2025. Scaling laws for native multimodal models. arXiv:2504.07951. Retrieved from https:\/\/arxiv.org\/abs\/2504.07951","DOI":"10.1109\/ICCV51701.2025.00009"},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.1145\/3597060.3597237"},{"issue":"1","key":"e_1_3_2_37_2","first-page":"1","article-title":"Rtm-uavdet: A real-time multimodal uav detector","volume":"61","author":"Wang Guorui","year":"2024","unstructured":"Guorui Wang, Qian Jiang, Xin Jin, Michal Wozniak, Puming Wang, and Shaowen Yao. 2024. Rtm-uavdet: A real-time multimodal uav detector. 
IEEE Transactions on Aerospace and Electronic Systems 61, 1 (2024), 1\u201319.","journal-title":"IEEE Transactions on Aerospace and Electronic Systems"},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.1109\/IRI58017.2023.00025"},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1002\/aisy.202300251"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.dib.2021.107521"},{"key":"e_1_3_2_41_2","doi-asserted-by":"crossref","unstructured":"T. A. Sjaardema C. S. Smith and G. C. Birch. 2015. History and evolution of the Johnson criteria. Technical Report. Sandia National Lab.(SNL-NM) Albuquerque NM.","DOI":"10.2172\/1222446"},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.3390\/fi15080260"}],"container-title":["ACM Transactions on Cyber-Physical Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3807778","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,5,17]],"date-time":"2026-05-17T14:36:36Z","timestamp":1779028596000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3807778"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,5,12]]},"references-count":41,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2026,7,31]]}},"alternative-id":["10.1145\/3807778"],"URL":"https:\/\/doi.org\/10.1145\/3807778","relation":{},"ISSN":["2378-962X","2378-9638"],"issn-type":[{"value":"2378-962X","type":"print"},{"value":"2378-9638","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,5,12]]},"assertion":[{"value":"2024-12-20","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2026-04-01","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2026-05-12","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}