{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,14]],"date-time":"2026-07-14T23:10:05Z","timestamp":1784070605222,"version":"3.55.0"},"reference-count":147,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2026,1,16]],"date-time":"2026-01-16T00:00:00Z","timestamp":1768521600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["BDCC"],"abstract":"<jats:p>The integration of large models and multimodal foundation models into the low-altitude economy is driving a transformative shift, enabling intelligent, autonomous, and efficient operations for low-altitude vehicles (LAVs). This article provides a comprehensive analysis of the role these large models play within the smart integrated lower airspace system (SILAS), focusing on their applications across the four fundamental networks: facility, information, air route, and service. Our analysis yields several key findings, which pave the way for enhancing the application of large models in the low-altitude economy. By leveraging advanced capabilities in perception, reasoning, and interaction, large models are demonstrated to enhance critical functions such as high-precision remote sensing interpretation, robust meteorological forecasting, reliable visual localization, intelligent path planning, and collaborative multi-agent decision-making. Furthermore, we find that the integration of these models with key enabling technologies, including edge computing, sixth-generation (6G) communication networks, and integrated sensing and communication (ISAC), effectively addresses challenges related to real-time processing, resource constraints, and dynamic operational environments. 
Significant challenges, including sustainable operation under severe resource limitations, data security, network resilience, and system interoperability, are examined alongside potential solutions. Based on our survey, we discuss future research directions, such as the development of specialized low-altitude models, high-efficiency deployment paradigms, advanced multimodal fusion, and the establishment of trustworthy distributed intelligence frameworks. This survey offers a forward-looking perspective on this rapidly evolving field and underscores the pivotal role of large models in unlocking the full potential of the next-generation low-altitude economy.<\/jats:p>","DOI":"10.3390\/bdcc10010033","type":"journal-article","created":{"date-parts":[[2026,1,16]],"date-time":"2026-01-16T08:08:21Z","timestamp":1768550901000},"page":"33","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Large Model in Low-Altitude Economy: Applications and Challenges"],"prefix":"10.3390","volume":"10","author":[{"given":"Jinpeng","family":"Hu","sequence":"first","affiliation":[{"name":"Spatial Information Technology Application Department, Changjiang River Scientific Research Institute, Wuhan 430010, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Wei","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan 430074, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yuxiao","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan 430074, 
China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4463-0632","authenticated-orcid":false,"given":"Jing","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan 430074, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2026,1,16]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"6659","DOI":"10.1109\/JIOT.2024.3491796","article-title":"Unauthorized UAV Countermeasure for Low-Altitude Economy: Joint Communications and Jamming Based on MIMO Cellular Systems","volume":"12","author":"Li","year":"2025","journal-title":"IEEE Internet Things J."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"103377","DOI":"10.1016\/j.trc.2021.103377","article-title":"Urban air mobility: A comprehensive review and comparative analysis with autonomous and electric ground transportation for informing future research","volume":"132","author":"Garrow","year":"2021","journal-title":"Transp. Res. Part Emerg. Technol."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"14438","DOI":"10.1109\/JIOT.2023.3268316","article-title":"AI for UAV-Assisted IoT Applications: A Comprehensive Review","volume":"10","author":"Cheng","year":"2023","journal-title":"IEEE Internet Things J."},{"key":"ref_4","first-page":"28809","article-title":"Low-Altitude Intelligent Network Networking and Control Theories and Methods (in Chinese)","volume":"45","author":"Wu","year":"2024","journal-title":"Acta Aeronaut. Astronaut. Sin."},{"key":"ref_5","unstructured":"Solomon, Y. (2025, October 12). With 1 Announcement, the FAA Just Created an $82 Billion Market and 100,000 New Jobs, 2016. 
Available online: https:\/\/www.inc.com\/yoram-solomon\/with-one-rule-the-faa-just-created-an-82-billion-market-and-100000-new-jobs.html."},{"key":"ref_6","unstructured":"Federal Aviation Administration (FAA) (2020). Unmanned Aircraft System (UAS) Traffic Management (UTM) Concept of Operations v2.0, Technical Report."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Lieb, J., and Volkert, A. (2020, January 11\u201315). Unmanned Aircraft Systems Traffic Management: A comparsion on the FAA UTM and the European CORUS ConOps based on U-space. Proceedings of the 2020 AIAA\/IEEE 39th Digital Avionics Systems Conference (DASC), Virtual.","DOI":"10.1109\/DASC50938.2020.9256745"},{"key":"ref_8","unstructured":"Chen, Y., Yu, C., Wu, T., Lin, B., Ma, L., Li, Y., Chen, M., and Shi, J. (2025). Low-Altitude Economy Scenario White Paper (English Edition), Chinese Society of Aeronautics and Astronautics (CSAA). White Paper."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Baum, M.S. (2021). Unmanned Aircraft Systems Traffic Management: UTM, CRC Press. [1st ed.].","DOI":"10.1201\/9781003124689"},{"key":"ref_10","unstructured":"SESAR Joint Undertaking (SJU) (2025). SESAR Master Plan 2025: Accelerating the Digital European Sky, SESAR Joint Undertaking. Strategic roadmap for the digital transformation of European air traffic management under the Single European Sky initiative."},{"key":"ref_11","unstructured":"SESAR Joint Undertaking (SJU) (2019). Proposal for the Future Architecture of the European Airspace, Publications Office of the European Union. Developed under the Delegation Agreement MOVE\/E3\/DA\/2017-477\/SI2.766828 between the European Commission and the SESAR Joint Undertaking."},{"key":"ref_12","unstructured":"3rd Generation Partnership Project (3GPP) (2025, October 12). 3GPP TS 38.211: NR; Physical Channels and Modulation (Release 18). Technical Report 38.211, 3rd Generation Partnership Project (3GPP). 
Available online: https:\/\/portal.3gpp.org\/desktopmodules\/Specifications\/SpecificationDetails.aspx?specificationId=3231."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"244","DOI":"10.1109\/MNET.011.2000493","article-title":"Non-Terrestrial Networks in the 6G Era: Challenges and Opportunities","volume":"35","author":"Giordani","year":"2021","journal-title":"IEEE Netw."},{"key":"ref_14","unstructured":"International Telecommunication Union (ITU) (2025, October 12). ITU-R M.2171: Characteristics of Terrestrial IMT-Advanced Systems for Frequency Sharing\/Interference Analyses. Technical Report M.2171, International Telecommunication Union, Radiocommunication Sector (ITU-R). Available online: https:\/\/www.itu.int\/pub\/R-REP-M.2171."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"207201","DOI":"10.1007\/s11432-022-3516-2","article-title":"IEEE Standard Pioneered an IT-Led Interdisciplinary Approach to Structure Low-Altitude Airspace for UAV Operations","volume":"65","author":"Xu","year":"2022","journal-title":"Sci. China Inf. Sci."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1728","DOI":"10.1109\/JSAC.2022.3156632","article-title":"Integrated Sensing and Communications: Toward Dual-Functional Wireless Networks for 6G and Beyond","volume":"40","author":"Liu","year":"2022","journal-title":"IEEE J. Sel. Areas Commun."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"3327","DOI":"10.1109\/JSAC.2024.3492720","article-title":"Space-Air-Ground Integrated Wireless Networks for 6G: Basics, Key Technologies, and Future Trends","volume":"42","author":"Xiao","year":"2024","journal-title":"IEEE J. Sel. 
Areas Commun."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"90","DOI":"10.23919\/JCC.2022.02.008","article-title":"Space-air-ground integrated network (SAGIN) for 6G: Requirements, architecture and challenges","volume":"19","author":"Cui","year":"2022","journal-title":"China Commun."},{"key":"ref_19","first-page":"1","article-title":"A Survey on Channel Sounding Technologies and Measurements for UAV-Assisted Communications","volume":"73","author":"Mao","year":"2024","journal-title":"IEEE Trans. Instrum. Meas."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"14477","DOI":"10.1109\/TITS.2025.3569500","article-title":"A Survey on Autonomous and Intelligent Swarms of Uncrewed Aerial Vehicles (UAVs)","volume":"26","author":"Du","year":"2025","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1867","DOI":"10.1109\/COMST.2024.3471671","article-title":"Trajectory-Prediction Techniques for Unmanned Aerial Vehicles (UAVs): A Comprehensive Survey","volume":"27","author":"Shukla","year":"2025","journal-title":"IEEE Commun. Surv. Tutor."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"51951","DOI":"10.1109\/JIOT.2025.3618483","article-title":"Toward a Sustainable Low-Altitude Economy: A Survey of Energy-Efficient RIS-UAV Networks","volume":"12","author":"Ahmed","year":"2025","journal-title":"IEEE Internet Things J."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"218","DOI":"10.1109\/COMST.2024.3417336","article-title":"On the Ground and in the Sky: A Tutorial on Radio Localization in Ground-Air-Space Networks","volume":"27","author":"Sallouha","year":"2025","journal-title":"IEEE Commun. Surv. Tutor."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3617992","article-title":"A Survey on the Unmanned Aircraft System Traffic Management","volume":"56","author":"Hamissi","year":"2023","journal-title":"ACM Comput. 
Surv."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"100726","DOI":"10.1016\/j.paerosci.2021.100726","article-title":"Designing airspace for urban air mobility: A review of concepts and approaches","volume":"125","author":"Bauranov","year":"2021","journal-title":"Prog. Aerosp. Sci."},{"key":"ref_26","unstructured":"IDEA Research Institute (2025, October 12). Low-Altitude Economy Development White Paper (2.0)\u2014Full Digitalization Scheme. Available online: https:\/\/www.idea.edu.cn\/news\/7638.html."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"He, D., Yuan, W., Wu, J., and Liu, R. (2025). Ubiquitous UAV Communication Enabled Low-Altitude Economy: Applications, Techniques, and 3GPP\u2019s Efforts. IEEE Netw., 1.","DOI":"10.1109\/MNET.2025.3574922"},{"key":"ref_28","unstructured":"Brohan, A., Brown, N., Carbajal, J., Chebotar, Y., Chen, X., Choromanski, K., Ding, T., Driess, D., Dubey, A., and Finn, C. (2023). RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control. arXiv."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Tao, L., Zhang, H., Jing, H., Liu, Y., Yan, D., Wei, G., and Xue, X. (2025). Advancements in Vision\u2013Language Models for Remote Sensing: Datasets, Capabilities, and Enhancement Techniques. Remote Sens., 17.","DOI":"10.3390\/rs17010162"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"34","DOI":"10.1007\/s44267-024-00065-8","article-title":"An Overview of Large AI Models and Their Applications","volume":"2","author":"Tu","year":"2024","journal-title":"Vis. Intell."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"227","DOI":"10.1007\/s10462-025-11236-4","article-title":"Parameter-Efficient Fine-Tuning in Large Language Models: A Survey of Methodologies","volume":"58","author":"Wang","year":"2025","journal-title":"Artif. Intell. 
Rev."},{"key":"ref_32","first-page":"1","article-title":"Efficient Prompt Tuning of Large Vision-Language Model for Fine-Grained Ship Classification","volume":"63","author":"Lan","year":"2025","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Yadav, K., Ramrakhya, R., Ramakrishnan, S.K., Gervet, T., Turner, J., Gokaslan, A., Maestre, N., Chang, A.X., Batra, D., and Savva, M. (2023). Habitat-Matterport 3D Semantics Dataset. arXiv.","DOI":"10.1109\/CVPR52729.2023.00477"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"2788","DOI":"10.1109\/TCCN.2025.3601015","article-title":"Toward Realization of Low-Altitude Economy Networks: Core Architecture, Integrated Technologies, and Future Directions","volume":"11","author":"Wang","year":"2025","journal-title":"IEEE Trans. Cogn. Commun. Netw."},{"key":"ref_35","first-page":"1","article-title":"RemoteCLIP: A Vision Language Foundation Model for Remote Sensing","volume":"62","author":"Liu","year":"2024","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/TGRS.2024.3510781","article-title":"RS5M and GeoRSCLIP: A Large-Scale Vision-Language Dataset and a Large Vision-Language Model for Remote Sensing","volume":"62","author":"Zhang","year":"2024","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"238","DOI":"10.1109\/MGRS.2025.3560455","article-title":"Text2Earth: Unlocking text-driven remote sensing image generation with a global-scale dataset and a foundation model","volume":"13","author":"Liu","year":"2025","journal-title":"IEEE Geosci. Remote Sens. Mag."},{"key":"ref_38","unstructured":"Xiong, Z., Wang, Y., Yu, W., Stewart, A.J., Zhao, J., Lehmann, N., Dujardin, T., Yuan, Z., Ghamisi, P., and Zhu, X.X. (2025). DOFA-CLIP: Multimodal Vision-Language Foundation Models for Earth Observation. 
arXiv."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Du, S., Tang, S., Wang, W., Li, X., and Guo, R. (2023). Tree-GPT: Modular Large Language Model Expert System for Forest Remote Sensing Image Understanding and Interactive Analysis. arXiv.","DOI":"10.5194\/isprs-archives-XLVIII-1-W2-2023-1729-2023"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/TGRS.2025.3645032","article-title":"MS-LIP: Multiscale Semantic Information Integration With Large Language Models for Marine Prediction","volume":"63","author":"Lv","year":"2025","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_41","unstructured":"Zhang, Z., Shen, H., Zhao, T., Chen, B., Guan, Z., Wang, Y., Jia, X., Cai, Y., Shang, Y., and Yin, J. (2025). GeoRSMLLM: A Multimodal Large Language Model for Vision-Language Tasks in Geoscience and Remote Sensing. arXiv."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/TGRS.2024.3510781","article-title":"EarthGPT: A Universal Multimodal Large Language Model for Multisensor Image Comprehension in Remote Sensing Domain","volume":"62","author":"Zhang","year":"2024","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_43","unstructured":"Yao, K., Xu, N., Yang, R., Xu, Y., Gao, Z., Kitrungrotsakul, T., Ren, Y., Zhang, P., Wang, J., and Wei, N. (2025). Falcon: A Remote Sensing Vision-Language Foundation Model. arXiv."},{"key":"ref_44","unstructured":"Shabbir, A., Zumri, M., Bennamoun, M., Khan, F.S., and Khan, S. (2025). GeoPixel: Pixel Grounding Large Multimodal Model in Remote Sensing. arXiv."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Zhang, W., Cai, M., Zhang, T., Li, J., Zhuang, Y., and Mao, X. (2024). EarthMarker: A Visual Prompting Multi-modal Large Language Model for Remote Sensing. arXiv.","DOI":"10.1109\/TGRS.2024.3523505"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Guo, H., Su, X., Wu, C., Du, B., Zhang, L., and Li, D. 
(2024, January 7\u201312). Remote Sensing ChatGPT: Solving Remote Sensing Tasks with ChatGPT and Visual Models. Proceedings of the IGARSS 2024\u20142024 IEEE International Geoscience and Remote Sensing Symposium, Athens, Greece.","DOI":"10.1109\/IGARSS53475.2024.10640736"},{"key":"ref_47","unstructured":"Xu, W., Yu, Z., Mu, B., Wei, Z., Zhang, Y., Li, G., and Peng, M. (2025). RS-Agent: Automating Remote Sensing Tasks through Intelligent Agent. arXiv."},{"key":"ref_48","first-page":"1","article-title":"Change-Agent: Toward Interactive Comprehensive Remote Sensing Change Interpretation and Analysis","volume":"62","author":"Liu","year":"2024","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_49","first-page":"1","article-title":"Forecasting of Tropospheric Delay Using AI Foundation Models in Support of Microwave Remote Sensing","volume":"62","author":"Ding","year":"2024","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_50","unstructured":"Li, S., Yang, W., Zhang, P., Xiao, X., Cao, D., Qin, Y., Zhang, X., Zhao, Y., and Bogdan, P. (2025). ClimateLLM: Efficient Weather Forecasting via Frequency-Aware Large Language Models. arXiv."},{"key":"ref_51","first-page":"1","article-title":"Physics-Informed Learning for Tropical Cyclone Intensity Prediction","volume":"62","author":"Wang","year":"2024","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Li, R., Tan, R.T., and Cheong, L.F. (2020, January 13\u201319). All in One Bad Weather Removal Using Architectural Search. 
Proceedings of the 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00324"},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"106421","DOI":"10.1016\/j.envsoft.2025.106421","article-title":"Can large language models effectively reason about adverse weather conditions?","volume":"188","author":"Zafarmomen","year":"2025","journal-title":"Environ. Model. Softw."},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Chen, S., Long, G., Jiang, J., and Zhang, C. (2024). Personalized Adapter for Large Meteorology Model on Devices: Towards Weather Foundation Models. arXiv.","DOI":"10.52202\/079017-2696"},{"key":"ref_55","doi-asserted-by":"crossref","unstructured":"Chen, J., Zhou, P., Hua, Y., Chong, D., Cao, M., Li, Y., Chen, W., Zhu, B., Liang, J., and Yuan, Z. (2025, January 3\u20137). ClimateIQA: A New Dataset and Benchmark to Advance Vision-Language Models in Meteorology Anomalies Analysis. Proceedings of the KDD \u201925: 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2, New York, NY, USA.","DOI":"10.1145\/3711896.3737406"},{"key":"ref_56","unstructured":"Tang, S., Xu, J., Zhang, J., Chen, Y., Jin, Q., Shen, L., Liu, C., and Xiang, S. (2025). MeteorPred: A Meteorological Multimodal Large Model and Dataset for Severe Weather Event Prediction. arXiv."},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Jose Valanarasu, J.M., Yasarla, R., and Patel, V.M. (2022, January 18\u201324). TransWeather: Transformer-based Restoration of Images Degraded by Adverse Weather Conditions. Proceedings of the 2022 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00239"},{"key":"ref_58","doi-asserted-by":"crossref","unstructured":"Chen, W.T., Huang, Z.K., Tsai, C.C., Yang, H.H., Ding, J.J., and Kuo, S.Y. (2022, January 19\u201324). 
Learning Multiple Adverse Weather Removal via Two-stage Knowledge Learning and Multi-contrastive Regularization: Toward a Unified Model. Proceedings of the 2022 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01713"},{"key":"ref_59","doi-asserted-by":"crossref","first-page":"1683","DOI":"10.1109\/TIP.2024.3368961","article-title":"Exploring the Application of Large-Scale Pre-Trained Models on Adverse Weather Removal","volume":"33","author":"Tan","year":"2024","journal-title":"IEEE Trans. Image Process."},{"key":"ref_60","doi-asserted-by":"crossref","unstructured":"Mi, W., Chen, H., and Liu, W. (2024, January 2\u20133). Hierarchical Interpretable Vision Reasoning Driven Through a Multi-Modal Large Language Model for Depth Estimation. Proceedings of the 2024 China Automation Congress (CAC), Qingdao, China.","DOI":"10.1109\/CAC63892.2024.10865484"},{"key":"ref_61","doi-asserted-by":"crossref","unstructured":"Liu, H., Yang, S., Long, C., Yuan, J., Yang, Q., Fan, J., Meng, B., Chen, Z., Xu, F., and Mou, C. (2025). Urban Greening Analysis: A Multimodal Large Language Model for Pinpointing Vegetation Areas in Adverse Weather Conditions. Remote Sens., 17.","DOI":"10.3390\/rs17122058"},{"key":"ref_62","doi-asserted-by":"crossref","first-page":"74","DOI":"10.1109\/MCOM.005.2100142","article-title":"Toward 6G with Connected Sky: UAVs and Beyond","volume":"59","author":"Mozaffari","year":"2021","journal-title":"IEEE Commun. Mag."},{"key":"ref_63","unstructured":"International Telecommunication Union (2025, October 12). Framework and Overall Objectives of the Future Development of IMT for 2030 and Beyond; Recommendation ITU-R M.2160-0; International Telecommunication Union Radiocommunication Sector (ITU-R): Geneva, Switzerland, 2023. 
Available online: https:\/\/www.itu.int\/pub\/R-REC-M.2160."},{"key":"ref_64","doi-asserted-by":"crossref","unstructured":"Kuckreja, K., Danish, M.S., Naseer, M., Das, A., Khan, S., and Khan, F.S. (2024, January 16\u201322). GeoChat: Grounded Large Vision-Language Model for Remote Sensing. Proceedings of the 2024 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR52733.2024.02629"},{"key":"ref_65","doi-asserted-by":"crossref","unstructured":"Koubaa, A., Ammar, A., Abdelkader, M., Alhabashi, Y., and Ghouti, L. (2023). AERO: AI-Enabled Remote Sensing Observation with Onboard Edge Computing in UAVs. Remote Sens., 15.","DOI":"10.3390\/rs15071873"},{"key":"ref_66","doi-asserted-by":"crossref","unstructured":"Wang, C.Y., Bochkovskiy, A., and Liao, H.Y.M. (2023, January 18\u201322). YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors. Proceedings of the 2023 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.00721"},{"key":"ref_67","doi-asserted-by":"crossref","first-page":"40","DOI":"10.1109\/MWC.001.2000292","article-title":"Edge Intelligence for Autonomous Driving in 6G Wireless System: Design Challenges and Solutions","volume":"28","author":"Yang","year":"2021","journal-title":"IEEE Wirel. Commun."},{"key":"ref_68","doi-asserted-by":"crossref","unstructured":"Sun, G., Xie, W., Niyato, D., Du, H., Kang, J., Wu, J., Sun, S., and Zhang, P. (2024). Generative AI for Advanced UAV Networking. arXiv.","DOI":"10.1109\/MNET.2024.3494862"},{"key":"ref_69","doi-asserted-by":"crossref","unstructured":"Odat, E., Ghazzai, H., and Alsharoa, A. (2024). A WaveGAN Approach for mmWave-Based FANET Topology Optimization. 
Sensors, 24.","DOI":"10.3390\/s24010006"},{"key":"ref_70","doi-asserted-by":"crossref","first-page":"6592","DOI":"10.1109\/TVT.2020.2984624","article-title":"Beyond D2D: Full Dimension UAV-to-Everything Communications in 6G","volume":"69","author":"Zhang","year":"2020","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_71","doi-asserted-by":"crossref","unstructured":"Rashid, M.T., Zhang, D.Y., and Wang, D. (2020, January 6\u20139). SocialDrone: An Integrated Social Media and Drone Sensing System for Reliable Disaster Response. Proceedings of the IEEE INFOCOM 2020\u2014IEEE Conference on Computer Communications, Virtual.","DOI":"10.1109\/INFOCOM41043.2020.9155522"},{"key":"ref_72","doi-asserted-by":"crossref","first-page":"1655","DOI":"10.32604\/iasc.2023.039057","article-title":"3D Model Construction and Ecological Environment Investigation on a Regional Scale Using UAV Remote Sensing","volume":"37","author":"Chen","year":"2023","journal-title":"Intell. Autom. Soft Comput."},{"key":"ref_73","doi-asserted-by":"crossref","first-page":"40","DOI":"10.1109\/MWC.01.1900545","article-title":"UAV-Assisted Attack Prevention, Detection, and Recovery of 5G Networks","volume":"27","author":"Abdalla","year":"2020","journal-title":"IEEE Wirel. Commun."},{"key":"ref_74","doi-asserted-by":"crossref","first-page":"15435","DOI":"10.1109\/JIOT.2022.3176400","article-title":"A Survey on the Convergence of Edge Computing and AI for UAVs: Opportunities and Challenges","volume":"9","author":"McEnroe","year":"2022","journal-title":"IEEE Internet Things J."},{"key":"ref_75","doi-asserted-by":"crossref","first-page":"35","DOI":"10.1109\/MVT.2023.3323757","article-title":"Sparks of Generative Pretrained Transformers in Edge Intelligence for the Metaverse: Caching and Inference for Mobile Artificial Intelligence-Generated Content Services","volume":"18","author":"Xu","year":"2023","journal-title":"IEEE Veh. Technol. 
Mag."},{"key":"ref_76","doi-asserted-by":"crossref","first-page":"423","DOI":"10.1109\/TPAMI.2018.2798607","article-title":"Multimodal Machine Learning: A Survey and Taxonomy","volume":"41","author":"Ahuja","year":"2019","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_77","unstructured":"Team, G., Anil, R., Borgeaud, S., Alayrac, J.B., Yu, J., Soricut, R., Schalkwyk, J., Dai, A.M., Hauth, A., and Millican, K. (2025). Gemini: A Family of Highly Capable Multimodal Models. arXiv."},{"key":"ref_78","doi-asserted-by":"crossref","first-page":"1304","DOI":"10.1109\/COMST.2022.3171135","article-title":"What Will the Future of UAV Cellular Communications Be? A Flight From 5G to 6G","volume":"24","author":"Geraci","year":"2022","journal-title":"IEEE Commun. Surv. Tutor."},{"key":"ref_79","doi-asserted-by":"crossref","first-page":"9164","DOI":"10.1109\/JIOT.2021.3056569","article-title":"Computation Offloading in LEO Satellite Networks With Hybrid Cloud and Edge Computing","volume":"8","author":"Tang","year":"2021","journal-title":"IEEE Internet Things J."},{"key":"ref_80","unstructured":"Rajatheva, N., Atzeni, I., Bjornson, E., Bourdoux, A., Buzzi, S., Dore, J.B., Erkucuk, S., Fuentes, M., Guan, K., and Hu, Y. (2020). White Paper on Broadband Connectivity in 6G. arXiv."},{"key":"ref_81","doi-asserted-by":"crossref","first-page":"246","DOI":"10.1109\/MWC.013.2300485","article-title":"Generative AI for Integrated Sensing and Communication: Insights From the Physical Layer Perspective","volume":"31","author":"Wang","year":"2024","journal-title":"IEEE Wirel. Commun."},{"key":"ref_82","doi-asserted-by":"crossref","first-page":"8582","DOI":"10.1109\/TMC.2024.3350886","article-title":"Joint Task Offloading and Resource Allocation in Aerial-Terrestrial UAV Networks With Edge and Fog Computing for Post-Disaster Rescue","volume":"23","author":"Sun","year":"2024","journal-title":"IEEE Trans. Mob. 
Comput."},{"key":"ref_83","doi-asserted-by":"crossref","unstructured":"Yang, B., Cao, X., Yuen, C., and Qian, L. (2020). Offloading Optimization in Edge Computing for Deep Learning Enabled Target Tracking by Internet-of-UAVs. arXiv.","DOI":"10.1109\/JIOT.2020.3016694"},{"key":"ref_84","doi-asserted-by":"crossref","first-page":"164","DOI":"10.1109\/MWC.015.2300404","article-title":"Big AI Models for 6G Wireless Networks: Opportunities, Challenges, and Research Directions","volume":"31","author":"Chen","year":"2024","journal-title":"Wirel. Commun."},{"key":"ref_85","doi-asserted-by":"crossref","first-page":"109382","DOI":"10.1016\/j.comnet.2022.109382","article-title":"AR-GAIL: Adaptive routing protocol for FANETs using generative adversarial imitation learning","volume":"218","author":"Liu","year":"2022","journal-title":"Comput. Netw."},{"key":"ref_86","doi-asserted-by":"crossref","first-page":"9417","DOI":"10.1109\/TWC.2022.3176480","article-title":"Generative Neural Network Channel Modeling for Millimeter-Wave UAV Communication","volume":"21","author":"Xia","year":"2022","journal-title":"IEEE Trans. Wirel. Commun."},{"key":"ref_87","doi-asserted-by":"crossref","first-page":"202","DOI":"10.1109\/MNET.2023.3321538","article-title":"AI-Generated Network Design: A Diffusion Model-Based Learning Approach","volume":"38","author":"Huang","year":"2024","journal-title":"IEEE Netw."},{"key":"ref_88","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1109\/JSAC.2024.3460078","article-title":"Large Models for Aerial Edges: An Edge-Cloud Model Evolution and Communication Paradigm","volume":"43","author":"Zhang","year":"2025","journal-title":"IEEE J. Sel. Areas Commun."},{"key":"ref_89","doi-asserted-by":"crossref","first-page":"643","DOI":"10.1109\/TWC.2024.3497923","article-title":"Beyond the Cloud: Edge Inference for Generative Large Language Models in Wireless Networks","volume":"24","author":"Zhang","year":"2025","journal-title":"IEEE Trans. Wirel. 
Commun."},{"key":"ref_90","doi-asserted-by":"crossref","first-page":"111792","DOI":"10.1109\/ACCESS.2021.3103041","article-title":"Artificial Intelligence for Enhanced Mobility and 5G Connectivity in UAV-Based Critical Missions","volume":"9","author":"Lins","year":"2021","journal-title":"IEEE Access"},{"key":"ref_91","doi-asserted-by":"crossref","first-page":"4147","DOI":"10.1109\/TCSVT.2021.3104305","article-title":"Digital Retina: A Way to Make the City Brain More Efficient by Visual Coding","volume":"31","author":"Gao","year":"2021","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_92","doi-asserted-by":"crossref","first-page":"637","DOI":"10.1109\/TWC.2020.3027624","article-title":"Predictive Deployment of UAV Base Stations in Wireless Networks: Machine Learning Meets Contract Theory","volume":"20","author":"Zhang","year":"2021","journal-title":"IEEE Trans. Wirel. Commun."},{"key":"ref_93","doi-asserted-by":"crossref","first-page":"5401","DOI":"10.1109\/TCOMM.2022.3184160","article-title":"Secure and Energy-Efficient UAV Relay Communications Exploiting Collaborative Beamforming","volume":"70","author":"Sun","year":"2022","journal-title":"IEEE Trans. Commun."},{"key":"ref_94","doi-asserted-by":"crossref","first-page":"36","DOI":"10.1109\/MWC.121.2100041","article-title":"Robust Edge Computing in UAV Systems via Scalable Computing and Cooperative Computing","volume":"28","author":"Liu","year":"2021","journal-title":"IEEE Wirel. Commun."},{"key":"ref_95","unstructured":"Lynch, C., Khansari, M., Xiao, T., Kumar, V., Tompson, J., Levine, S., and Sermanet, P. (November, January 30). Learning Latent Plans from Play. Proceedings of the Conference on Robot Learning, Osaka, Japan."},{"key":"ref_96","doi-asserted-by":"crossref","unstructured":"Lynch, C., and Sermanet, P. (2021). Language Conditioned Imitation Learning over Unstructured Data. 
arXiv.","DOI":"10.15607\/RSS.2021.XVII.047"},{"key":"ref_97","unstructured":"Palo, N.D., Byravan, A., Hasenclever, L., Wulfmeier, M., Heess, N., and Riedmiller, M. (2023). Towards A Unified Agent with Foundation Models. arXiv."},{"key":"ref_98","unstructured":"Du, Y., Watkins, O., Wang, Z., Colas, C., Darrell, T., Abbeel, P., Gupta, A., and Andreas, J. (2023, January 23\u201329). Guiding Pretraining in Reinforcement Learning with Large Language Models. Proceedings of the International Conference on Machine Learning, Honolulu, HI, USA."},{"key":"ref_99","unstructured":"Ahn, M., Brohan, A., Brown, N., Chebotar, Y., Cortes, O., David, B., Finn, C., Gopalakrishnan, K., Hausman, K., and Herzog, A. (2022, January 14\u201318). Do As I Can, Not As I Say: Grounding Language in Robotic Affordances. Proceedings of the Conference on Robot Learning, Auckland, New Zealand."},{"key":"ref_100","unstructured":"Huang, W., Wang, C., Zhang, R., Li, Y., Wu, J., and Fei-Fei, L. (2023). VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models. arXiv."},{"key":"ref_101","unstructured":"Zhao, H., Pan, F., Ping, H., and Zhou, Y. (2023). Agent as Cerebrum, Controller as Cerebellum: Implementing an Embodied LMM-based Agent on Drones. arXiv."},{"key":"ref_102","doi-asserted-by":"crossref","first-page":"701","DOI":"10.1177\/02783649241281508","article-title":"Foundation models in robotics: Applications, challenges, and the future","volume":"44","author":"Firoozi","year":"2025","journal-title":"Int. J. Robot. Res."},{"key":"ref_103","first-page":"52","article-title":"Large language models for robotics: Opportunities, challenges, and perspectives","volume":"4","author":"Wang","year":"2025","journal-title":"J. Autom. Intell."},{"key":"ref_104","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2020). 
An Image is Worth 16 \u00d7 16 Words: Transformers for Image Recognition at Scale. arXiv."},{"key":"ref_105","doi-asserted-by":"crossref","unstructured":"He, K., Chen, X., Xie, S., Li, Y., Doll\u00e1r, P., and Girshick, R. (2022, January 18\u201324). Masked Autoencoders Are Scalable Vision Learners. Proceedings of the 2022 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01553"},{"key":"ref_106","doi-asserted-by":"crossref","unstructured":"Li, T., Chang, H., Mishra, S.K., Zhang, H., Katabi, D., and Krishnan, D. (2023, January 18\u201322). MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis. Proceedings of the 2023 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.00213"},{"key":"ref_107","unstructured":"Zhu, X., Su, W., Lu, L., Li, B., Wang, X., and Dai, J. (2020). Deformable DETR: Deformable Transformers for End-to-End Object Detection. arXiv."},{"key":"ref_108","unstructured":"Zhang, H., Li, F., Liu, S., Zhang, L., Su, H., Zhu, J., Ni, L.M., and Shum, H.Y. (2022). DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection. arXiv."},{"key":"ref_109","unstructured":"Cheng, Y., Li, L., Xu, Y., Li, X., Yang, Z., Wang, W., and Yang, Y. (2023). Segment and Track Anything. arXiv."},{"key":"ref_110","doi-asserted-by":"crossref","unstructured":"Wu, Y., Wang, X., Yang, X., Liu, M., Zeng, D., Ye, H., and Li, S. (2025, January 11\u201315). Learning Occlusion-Robust Vision Transformers for Real-Time UAV Tracking. 
Proceedings of the 2025 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR52734.2025.01594"},{"key":"ref_111","doi-asserted-by":"crossref","first-page":"129445","DOI":"10.1016\/j.eswa.2025.129445","article-title":"Learning motion blur robust vision transformers for real-time UAV tracking","volume":"297","author":"Wu","year":"2026","journal-title":"Expert Syst. Appl."},{"key":"ref_112","unstructured":"Li, Y., Liu, M., Wu, Y., Wang, X., Yang, X., and Li, S. (2024, January 21\u201327). Learning adaptive and view-invariant vision transformer for real-time UAV tracking. Proceedings of the ICML\u201924: 41st International Conference on Machine Learning, Vienna, Austria."},{"key":"ref_113","unstructured":"Oquab, M., Darcet, T., Moutakanni, T., Vo, H., Szafraniec, M., Khalidov, V., Fernandez, P., Haziza, D., Massa, F., and El-Nouby, A. (2024). DINOv2: Learning Robust Visual Features without Supervision. arXiv."},{"key":"ref_114","unstructured":"Tong, Z., Song, Y., Wang, J., and Wang, L. (December, January 28). VideoMAE: Masked autoencoders are data-efficient learners for self-supervised video pre-training. Proceedings of the NIPS \u201922: 36th International Conference on Neural Information Processing Systems, Red Hook, NY, USA."},{"key":"ref_115","doi-asserted-by":"crossref","unstructured":"Wang, L., Huang, B., Zhao, Z., Tong, Z., He, Y., Wang, Y., Wang, Y., and Qiao, Y. (2023). VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking. arXiv.","DOI":"10.1109\/CVPR52729.2023.01398"},{"key":"ref_116","doi-asserted-by":"crossref","unstructured":"Cui, J., Liu, G., Wang, H., Yu, Y., and Yang, J. (2024, January 18\u201321). TPML: Task Planning for Multi-UAV System with Large Language Models. 
Proceedings of the 2024 IEEE 18th International Conference on Control & Automation (ICCA), Reykjav\u00edk, Iceland.","DOI":"10.1109\/ICCA62789.2024.10591846"},{"key":"ref_117","doi-asserted-by":"crossref","unstructured":"Liang, J., Huang, W., Xia, F., Xu, P., Hausman, K., Ichter, B., Florence, P., and Zeng, A. (June, January 29). Code as Policies: Language Model Programs for Embodied Control. Proceedings of the 2023 IEEE International Conference on Robotics and Automation (ICRA), London, UK.","DOI":"10.1109\/ICRA48891.2023.10160591"},{"key":"ref_118","unstructured":"Huang, W., Abbeel, P., Pathak, D., and Mordatch, I. (2022). Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents. arXiv."},{"key":"ref_119","doi-asserted-by":"crossref","unstructured":"Ding, Y., Zhang, X., Paxton, C., and Zhang, S. (2023, January 1\u20135). Task and Motion Planning with Large Language Models for Object Rearrangement. Proceedings of the 2023 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Detroit, MI, USA.","DOI":"10.1109\/IROS55552.2023.10342169"},{"key":"ref_120","unstructured":"Jiao, A., Patel, T.P., Khurana, S., Korol, A.M., Brunke, L., Adajania, V.K., Culha, U., Zhou, S., and Schoellig, A.P. (2023). Swarm-GPT: Combining Large Language Models with Safe Motion Planning for Robot Choreography Design. arXiv."},{"key":"ref_121","doi-asserted-by":"crossref","first-page":"55682","DOI":"10.1109\/ACCESS.2024.3387941","article-title":"ChatGPT for Robotics: Design Principles and Model Abilities","volume":"12","author":"Vemprala","year":"2024","journal-title":"IEEE Access"},{"key":"ref_122","unstructured":"Radford, A., Kim, J.W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., and Clark, J. (2021). Learning Transferable Visual Models From Natural Language Supervision. 
arXiv."},{"key":"ref_123","unstructured":"Jia, C., Yang, Y., Xia, Y., Chen, Y.T., Parekh, Z., Pham, H., Le, Q.V., Sung, Y., Li, Z., and Duerig, T. (2021). Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision. arXiv."},{"key":"ref_124","unstructured":"Fan, L., Krishnan, D., Isola, P., Katabi, D., and Tian, Y. (2023). Improving CLIP Training with Language Rewrites. arXiv."},{"key":"ref_125","doi-asserted-by":"crossref","unstructured":"Li, L.H., Zhang, P., Zhang, H., Yang, J., Li, C., Zhong, Y., Wang, L., Yuan, L., Zhang, L., and Hwang, J.N. (2022, January 18\u201324). Grounded Language-Image Pre-training. Proceedings of the 2022 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01069"},{"key":"ref_126","doi-asserted-by":"crossref","unstructured":"Yuan, Z., Xie, F., and Ji, T. (2024, January 20\u201322). Patrol Agent: An Autonomous UAV Framework for Urban Patrol Using on Board Vision Language Model and on Cloud Large Language Model. Proceedings of the 2024 International Conference on Robotics and Computer Vision (ICRCV), Wuxi, China.","DOI":"10.1109\/ICRCV62709.2024.10758606"},{"key":"ref_127","unstructured":"Li, J., Li, D., Xiong, C., and Hoi, S. (2022). BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation. arXiv."},{"key":"ref_128","unstructured":"Bai, J., Bai, S., Yang, S., Wang, S., Tan, S., Wang, P., Lin, J., Zhou, C., and Zhou, J. (2023). Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond. arXiv."},{"key":"ref_129","doi-asserted-by":"crossref","unstructured":"Li, Y., Zhang, Y., Wang, C., Zhong, Z., Chen, Y., Chu, R., Liu, S., and Jia, J. (2024). Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models. 
arXiv.","DOI":"10.1109\/TPAMI.2025.3637265"},{"key":"ref_130","unstructured":"Han, J., Chen, H., Zhao, Y., Wang, H., Zhao, Q., Yang, Z., He, H., Yue, X., and Jiang, L. (2025). Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations. arXiv."},{"key":"ref_131","unstructured":"Team, C. (2025). Chameleon: Mixed-Modal Early-Fusion Foundation Models. arXiv."},{"key":"ref_132","unstructured":"Li, T., Lu, Q., Zhao, L., Li, H., Zhu, X., Qiao, Y., Zhang, J., and Shao, W. (2025). UniFork: Exploring Modality Alignment for Unified Multimodal Understanding and Generation. arXiv."},{"key":"ref_133","unstructured":"Song, W., Wang, Y., Song, Z., Li, Y., Sun, H., Chen, W., Zhou, Z., Xu, J., Wang, J., and Yu, K. (2025). DualToken: Towards Unifying Visual Understanding and Generation with Dual Visual Vocabularies. arXiv."},{"key":"ref_134","doi-asserted-by":"crossref","unstructured":"Li, Y., Wang, C., and Jia, J. (2023). LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models. arXiv.","DOI":"10.1007\/978-3-031-72952-2_19"},{"key":"ref_135","doi-asserted-by":"crossref","unstructured":"Wang, Z., Yu, S., Stengel-Eskin, E., Yoon, J., Cheng, F., Bertasius, G., and Bansal, M. (2025). VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos. arXiv.","DOI":"10.1109\/CVPR52734.2025.00311"},{"key":"ref_136","unstructured":"Xu, G., Jin, P., Wu, Z., Li, H., Song, Y., Sun, L., and Yuan, L. (2025). LLaVA-CoT: Let Vision Language Models Reason Step-by-Step. arXiv."},{"key":"ref_137","doi-asserted-by":"crossref","unstructured":"Ke, F., Cai, Z., Jahangard, S., Wang, W., Haghighi, P.D., and Rezatofighi, H. (2024). HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning. arXiv.","DOI":"10.1007\/978-3-031-72661-3_8"},{"key":"ref_138","doi-asserted-by":"crossref","unstructured":"Gupta, T., and Kembhavi, A. (2023, January 18\u201322). Visual Programming: Compositional visual reasoning without training. 
Proceedings of the 2023 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.01436"},{"key":"ref_139","unstructured":"Liu, H., Li, C., Wu, Q., and Lee, Y.J. (2023). Visual Instruction Tuning. arXiv."},{"key":"ref_140","doi-asserted-by":"crossref","unstructured":"Liu, H., Li, C., Li, Y., and Lee, Y.J. (2024, January 16\u201322). Improved Baselines with Visual Instruction Tuning. Proceedings of the 2024 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR52733.2024.02484"},{"key":"ref_141","unstructured":"Lin, B., Tang, Z., Ye, Y., Huang, J., Zhang, J., Pang, Y., Jin, P., Ning, M., Luo, J., and Yuan, L. (2023). MoE-LLaVA: Mixture of Experts for Large Vision-Language Models, 2024. arXiv."},{"key":"ref_142","doi-asserted-by":"crossref","first-page":"35890","DOI":"10.1109\/JIOT.2025.3579780","article-title":"Generative AI-Driven Multiagent DRL for Task Allocation in UAV-Assisted EMPD Within 6G-Enabled SAGIN Networks","volume":"12","author":"Ullah","year":"2025","journal-title":"IEEE Internet Things J."},{"key":"ref_143","doi-asserted-by":"crossref","first-page":"627","DOI":"10.1109\/TMC.2025.3594188","article-title":"Task Assignment and Exploration Optimization for Low Altitude UAV Rescue via Generative AI Enhanced Multi-agent Reinforcement Learning","volume":"25","author":"Tang","year":"2025","journal-title":"IEEE Trans. Mob. 
Comput."},{"key":"ref_144","doi-asserted-by":"crossref","first-page":"48336","DOI":"10.1109\/JIOT.2025.3605692","article-title":"Joint Task Offloading and Resource Allocation in UAV-Assisted MEC Networks for Disaster Rescue: A Large AI Model Enabled DRL Approach","volume":"12","author":"Zhang","year":"2025","journal-title":"IEEE Internet Things J."},{"key":"ref_145","doi-asserted-by":"crossref","first-page":"615","DOI":"10.1109\/TWC.2024.3497593","article-title":"Dynamic UAV-Assisted Cooperative Edge AI Inference","volume":"24","author":"Huang","year":"2025","journal-title":"IEEE Trans. Wirel. Commun."},{"key":"ref_146","doi-asserted-by":"crossref","unstructured":"Haque, E., Hasan, K., Ahmed, I., Alam, M.S., and Islam, T. (2024, January 19\u201322). Enhancing UAV Security Through Zero Trust Architecture: An Advanced Deep Learning and Explainable AI Analysis. Proceedings of the 2024 International Conference on Computing, Networking and Communications (ICNC), Big Island, HI, USA.","DOI":"10.1109\/ICNC59896.2024.10556279"},{"key":"ref_147","doi-asserted-by":"crossref","first-page":"23379","DOI":"10.1109\/JIOT.2022.3206276","article-title":"Adversarial Attacks and Defenses Toward AI-Assisted UAV Infrastructure Inspection","volume":"9","author":"Raja","year":"2022","journal-title":"IEEE Internet Things J."}],"container-title":["Big Data and Cognitive 
Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-2289\/10\/1\/33\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,1,16]],"date-time":"2026-01-16T09:02:54Z","timestamp":1768554174000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-2289\/10\/1\/33"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,1,16]]},"references-count":147,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2026,1]]}},"alternative-id":["bdcc10010033"],"URL":"https:\/\/doi.org\/10.3390\/bdcc10010033","relation":{},"ISSN":["2504-2289"],"issn-type":[{"value":"2504-2289","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,1,16]]}}}