{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,18]],"date-time":"2025-12-18T14:17:20Z","timestamp":1766067440477,"version":"build-2065373602"},"reference-count":38,"publisher":"MDPI AG","issue":"12","license":[{"start":{"date-parts":[[2021,12,14]],"date-time":"2021-12-14T00:00:00Z","timestamp":1639440000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Symmetry"],"abstract":"<jats:p>Unmanned aerial vehicle (UAV) clusters usually face problems such as complex environments, heterogeneous combat subjects, and realistic interference factors in the course of mission assignment. In order to reduce resource consumption and improve the task execution rate, it is very important to develop a reasonable allocation plan for the tasks. Therefore, this paper constructs a heterogeneous UAV multitask assignment model based on several realistic constraints and proposes an improved half-random Q-learning (HR Q-learning) algorithm. The algorithm is based on the Q-learning algorithm under reinforcement learning, and by changing the way the Q-learning algorithm selects the next action in the process of random exploration, the probability of obtaining an invalid action in the random case is reduced, and the exploration efficiency is improved, thus increasing the possibility of obtaining a better assignment scheme, this also ensures symmetry and synergy in the distribution process of the drones. Simulation experiments show that compared with Q-learning algorithm and other heuristic algorithms, HR Q-learning algorithm can improve the performance of task execution, including the ability to improve the rationality of task assignment, increasing the value of gains by 12.12%, this is equivalent to an average of one drone per mission saved, and higher success rate of task execution. 
This improvement provides a meaningful attempt for UAV task assignment.<\/jats:p>","DOI":"10.3390\/sym13122417","type":"journal-article","created":{"date-parts":[[2021,12,14]],"date-time":"2021-12-14T22:06:10Z","timestamp":1639519570000},"page":"2417","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":18,"title":["Multi-UAV Cooperative Task Assignment Based on Half Random Q-Learning"],"prefix":"10.3390","volume":"13","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6552-1662","authenticated-orcid":false,"given":"Pengxing","family":"Zhu","sequence":"first","affiliation":[{"name":"School of Science, Wuhan University of Technology, Wuhan 430070, China"}]},{"given":"Xi","family":"Fang","sequence":"additional","affiliation":[{"name":"School of Science, Wuhan University of Technology, Wuhan 430070, China"}]}],"member":"1968","published-online":{"date-parts":[[2021,12,14]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"116","DOI":"10.1016\/j.cie.2018.04.037","article-title":"The unmanned aerial vehicle routing and trajectory optimisation problem, a taxonomic review","volume":"120","author":"Coutinho","year":"2018","journal-title":"Comput. Ind. 
Eng."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1016\/j.icte.2019.09.006","article-title":"Interference modeling and analysis in 3-dimensional directional UAV networks based on stochastic geometry","volume":"5","author":"Chu","year":"2019","journal-title":"ICT Express"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"102324","DOI":"10.1016\/j.adhoc.2020.102324","article-title":"A Comprehensive Review of Unmanned Aerial Vehicle Attacks and Neutralization Techniques","volume":"111","author":"Chamola","year":"2020","journal-title":"Ad Hoc Netw."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"339","DOI":"10.1016\/j.cja.2017.09.005","article-title":"Multi-UAV reconnaissance task allocation for heterogeneous targets using an opposition-based genetic algorithm with double-chromosome encoding","volume":"31","author":"Wang","year":"2018","journal-title":"Chin. J. Aeronaut."},{"key":"ref_5","first-page":"154","article-title":"Analysis on MAV\/UAV cooperative combat based on complex network","volume":"16","author":"Fan","year":"2020","journal-title":"Def. Technol."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"190","DOI":"10.1016\/j.cie.2017.10.030","article-title":"Unmanned aerial vehicle routing in the presence of threats","volume":"115","author":"Alotaibi","year":"2018","journal-title":"Comput. Ind. Eng."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"208","DOI":"10.1016\/j.isatra.2020.03.004","article-title":"Potential game for dynamic task allocation in multi-agent system","volume":"102","author":"Wu","year":"2020","journal-title":"ISA Trans."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"2825","DOI":"10.1016\/j.cja.2020.02.009","article-title":"Cooperative task assignment of multi-UAV system","volume":"33","author":"Jzab","year":"2020","journal-title":"Chin. J. 
Aeronaut."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"107067","DOI":"10.1016\/j.compeleceng.2021.107067","article-title":"Research on many-to-many target assignment for unmanned aerial vehicle swarm in three-dimensional scenarios","volume":"91","author":"Hua","year":"2021","journal-title":"Comput. Electr. Eng."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"758","DOI":"10.1016\/j.jpdc.2010.03.011","article-title":"Multi-heuristic dynamic task allocation using genetic algorithms in a heterogeneous distributed system","volume":"70","author":"Page","year":"2010","journal-title":"J. Parallel Distrib. Comput."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"415","DOI":"10.1016\/j.isatra.2019.08.018","article-title":"Efficient path planning for uav formation via comprehensively improved particle swarm optimization","volume":"97","author":"Shao","year":"2020","journal-title":"ISA Trans."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"105826","DOI":"10.1016\/j.ast.2020.105826","article-title":"An intelligent cooperative mission planning scheme of uav swarm in uncertain dynamic environment","volume":"100","author":"Zhen","year":"2020","journal-title":"Aerosp. Sci. Technol."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"107489","DOI":"10.1016\/j.cie.2021.107489","article-title":"Dynamic multi-objective scheduling for flexible job shop by deep reinforcement learning","volume":"159","author":"Shu","year":"2021","journal-title":"Comput. Ind. Eng."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"1054","DOI":"10.1109\/TNN.1998.712192","article-title":"Reinforcement learning: An introduction","volume":"9","author":"Sutton","year":"1998","journal-title":"IEEE Trans. Neural Netw."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Liu, R., Cui, J., and Song, Y. (2015, January 12\u201313). Forward Greedy Heuristic Algorithm for N-Vehicle Exploration Problem (NVEP). 
Proceedings of the 2015 8th International Symposium on Computational Intelligence and Design (ISCID), Hangzhou, China.","DOI":"10.1109\/ISCID.2015.133"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Tan, Z., and Karakose, M. (November, January 12). Optimized Deep Reinforcement Learning Approach for Dynamic System. Proceedings of the 2020 IEEE International Symposium on Systems Engineering (ISSE), Vienna, Austria.","DOI":"10.1109\/ISSE49799.2020.9272245"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"19306","DOI":"10.1109\/ACCESS.2020.2967061","article-title":"Task Allocation for Multi-Agent Systems Based on Distributed Many-Objective Evolutionary Algorithm and Greedy Algorithm","volume":"8","author":"Zhou","year":"2020","journal-title":"IEEE Access"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"106796","DOI":"10.1016\/j.asoc.2020.106796","article-title":"Optimal path planning approach based on Q-learning algorithm for mobile robots","volume":"97","author":"Maoudj","year":"2020","journal-title":"Appl. Soft Comput."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1016\/j.patrec.2020.05.006","article-title":"A PSO-based algorithm for mining association rules using a guided exploration strategy","volume":"138","author":"Rosas","year":"2020","journal-title":"Pattern Recognit. Lett."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"105643","DOI":"10.1016\/j.asoc.2019.105643","article-title":"Adaptive task allocation for multi-uav systems based on bacteria foraging behaviour","volume":"83","author":"Kurdi","year":"2019","journal-title":"Appl. 
Soft Comput."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"7155","DOI":"10.1007\/s00500-021-05675-8","article-title":"Multi-UAV reconnaissance task allocation for heterogeneous targets using grouping ant colony optimization algorithm","volume":"25","author":"Gao","year":"2021","journal-title":"Soft Comput."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"1309","DOI":"10.1016\/j.cie.2011.08.004","article-title":"Heuristic algorithms for assigning and scheduling flight missions in a military aviation unit","volume":"61","year":"2011","journal-title":"Comput. Ind. Eng."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1007\/s11227-020-03264-4","article-title":"Decentralized task allocation for heterogeneous multi-UAV system with task coupling constraints","volume":"77","author":"Ye","year":"2020","journal-title":"J. Supercomput."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"17841","DOI":"10.1109\/ACCESS.2018.2818733","article-title":"Multi-Type UAVs Cooperative Task Allocation Under Resource Constraints","volume":"6","author":"Huang","year":"2018","journal-title":"IEEE Access"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"345","DOI":"10.1016\/j.comcom.2020.01.006","article-title":"A novel mission planning method for UAVs\u2019 course of action","volume":"152","author":"Zhou","year":"2020","journal-title":"Comput. Commun."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"127","DOI":"10.1016\/j.physa.2017.08.094","article-title":"Modeling and simulation of dynamic ant colony\u2019s labor division for task allocation of UAV swarm","volume":"491","author":"Wu","year":"2017","journal-title":"Phys. A Stat. Mech. 
Its Appl."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"1949","DOI":"10.1016\/j.procs.2013.05.364","article-title":"An Operation-Time Simulation Framework for UAV Swarm Configuration and Mission Planning","volume":"18","author":"Wei","year":"2013","journal-title":"Procedia Comput. Sci."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1140","DOI":"10.1126\/science.aar6404","article-title":"A general reinforcement learning algorithm that masters chess, shogi, and go through self-play","volume":"362","author":"Silver","year":"2018","journal-title":"Science"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"859","DOI":"10.1126\/science.aau6249","article-title":"Human-level performance in 3D multiplayer games with population-based reinforcement learning","volume":"364","author":"Jaderberg","year":"2019","journal-title":"Science"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"121719","DOI":"10.1109\/ACCESS.2019.2937943","article-title":"A Middle Game Search Algorithm Applicable to Low-Cost Personal Computer for Go","volume":"7","author":"Li","year":"2019","journal-title":"IEEE Access"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"588","DOI":"10.1016\/j.ast.2019.06.024","article-title":"Fast task allocation for heterogeneous unmanned aerial vehicles through reinforcement learning","volume":"92","author":"Zhao","year":"2019","journal-title":"Aerosp. Sci. Technol."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1016\/j.cja.2020.12.027","article-title":"Relevant experience learning: A deep reinforcement learning method for UAV autonomous motion planning in complex unknown environments","volume":"34","author":"Hu","year":"2021","journal-title":"Chin. J. Aeronaut."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Xu, J., Guo, Q., Xiao, L., Li, Z., and Zhang, G. (2019, January 20\u201322). 
Autonomous Decision-Making Method for Combat Mission of UAV based on Deep Reinforcement Learning. Proceedings of the 2019 IEEE 4th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), Chengdu, China.","DOI":"10.1109\/IAEAC47372.2019.8998066"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"282","DOI":"10.1016\/j.compchemeng.2019.05.029","article-title":"Reinforcement Learning\u2014Overview of recent progress and implications for process control","volume":"127","author":"Shin","year":"2019","journal-title":"Comput. Chem. Eng."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"70","DOI":"10.1109\/TAMD.2010.2051031","article-title":"Intrinsically Motivated Reinforcement Learning: An Evolutionary Perspective","volume":"2","author":"Singh","year":"2010","journal-title":"IEEE Trans. Auton. Ment. Dev."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"218","DOI":"10.1016\/j.jprocont.2018.06.002","article-title":"A Finite Horizon Markov Decision Process Based Reinforcement Learning Control of a Rapid Thermal Processing system","volume":"68","author":"John","year":"2018","journal-title":"J. Process. Control."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1016\/j.cie.2017.05.026","article-title":"A reinforcement learning approach to parameter estimation in dynamic job shop scheduling","volume":"110","author":"Shahrabi","year":"2017","journal-title":"Comput. Ind. 
Eng."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"445","DOI":"10.1038\/nature14540","article-title":"Reinforcement learning improves behaviour from evaluative feedback","volume":"521","author":"Littman","year":"2015","journal-title":"Nature"}],"container-title":["Symmetry"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-8994\/13\/12\/2417\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T07:47:19Z","timestamp":1760168839000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-8994\/13\/12\/2417"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,12,14]]},"references-count":38,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2021,12]]}},"alternative-id":["sym13122417"],"URL":"https:\/\/doi.org\/10.3390\/sym13122417","relation":{},"ISSN":["2073-8994"],"issn-type":[{"type":"electronic","value":"2073-8994"}],"subject":[],"published":{"date-parts":[[2021,12,14]]}}}