{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,21]],"date-time":"2026-02-21T01:34:46Z","timestamp":1771637686580,"version":"3.50.1"},"reference-count":34,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2021,2,4]],"date-time":"2021-02-04T00:00:00Z","timestamp":1612396800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Unmanned aerial vehicles (UAVs) have been widely used in search and rescue (SAR) missions due to their high flexibility. A key problem in SAR missions is to search and track moving targets in an area of interest. In this paper, we focus on the problem of Cooperative Multi-UAV Observation of Multiple Moving Targets (CMUOMMT). In contrast to the existing literature, we not only optimize the average observation rate of the discovered targets, but we also emphasize the fairness of the observation of the discovered targets and the continuous exploration of the undiscovered targets, under the assumption that the total number of targets is unknown. To achieve this objective, a deep reinforcement learning (DRL)-based method is proposed under the Partially Observable Markov Decision Process (POMDP) framework, where each UAV maintains four observation history maps, and maps from different UAVs within a communication range can be merged to enhance UAVs\u2019 awareness of the environment. A deep convolutional neural network (CNN) is used to process the merged maps and generate the control commands to UAVs. 
The simulation results show that our policy can enable UAVs to balance between giving the discovered targets a fair observation and exploring the search region compared with other methods.<\/jats:p>","DOI":"10.3390\/s21041076","type":"journal-article","created":{"date-parts":[[2021,2,4]],"date-time":"2021-02-04T21:29:27Z","timestamp":1612474167000},"page":"1076","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":16,"title":["Searching and Tracking an Unknown Number of Targets: A Learning-Based Method Enhanced with Maps Merging"],"prefix":"10.3390","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2928-8987","authenticated-orcid":false,"given":"Peng","family":"Yan","sequence":"first","affiliation":[{"name":"School of Astronautics, Harbin Institute of Technology, Harbin 150001, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tao","family":"Jia","sequence":"additional","affiliation":[{"name":"Aerospace Technology Research Institute, China Aerodynamics Research and Development Center, Mianyang 621000, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chengchao","family":"Bai","sequence":"additional","affiliation":[{"name":"School of Astronautics, Harbin Institute of Technology, Harbin 150001, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2021,2,4]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Queralta, J.P., Taipalmaa, J., Pullinen, B.C., Sarker, V.K., Gia, T.N., Tenhunen, H., Gabbouj, M., Raitoharju, J., and Westerlund, T. (2020). Collaborative multi-robot systems for search and rescue: Coordination and perception. arXiv.","DOI":"10.1109\/ACCESS.2020.3030190"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Mendon\u00e7a, R., Marques, M.M., Marques, F., Lourenco, A., Pinto, E., Santana, P., Coito, F., Lobo, V., and Barata, J. 
(2016, January 19\u201323). A cooperative multi-robot team for the surveillance of shipwreck survivors at sea. Proceedings of the OCEANS 2016 MTS\/IEEE Monterey, Monterey, CA, USA.","DOI":"10.1109\/OCEANS.2016.7761074"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"601","DOI":"10.1007\/s10846-018-0898-1","article-title":"A fully-autonomous aerial robot for search and rescue applications in indoor environments using learning-based techniques","volume":"95","author":"Sampedro","year":"2019","journal-title":"J. Intell. Robot. Syst."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1109\/TCYB.2016.2628161","article-title":"Cooperative robots to observe moving targets","volume":"48","author":"Khan","year":"2016","journal-title":"IEEE Trans. Cybern."},{"key":"ref_5","unstructured":"Parker, L.E., and Emmons, B.A. (1997, January 20\u201325). Cooperative multi-robot observation of multiple moving targets. Proceedings of the International Conference on Robotics and Automation, Albuquerque, NM, USA."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"231","DOI":"10.1023\/A:1015256330750","article-title":"Distributed algorithms for multi-robot observation of multiple moving targets","volume":"12","author":"Parker","year":"2002","journal-title":"Auton. Robot."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"935","DOI":"10.1177\/0278364907080424","article-title":"Cooperative observation of multiple moving targets: An algorithm and its formalization","volume":"26","author":"Kolling","year":"2007","journal-title":"Int. J. Robot. Res."},{"key":"ref_8","unstructured":"Ding, Y., Zhu, M., He, Y., and Jiang, J. (2006, January 21\u201323). P-CMOMMT algorithm for the cooperative multi-robot observation of multiple moving targets. Proceedings of the 2006 6th World Congress on Intelligent Control and Automation, Dalian, China."},{"key":"ref_9","unstructured":"Peng, H., Su, F., Bu, Y., Zhang, G., and Shen, L. 
(2009, January 27\u201329). Cooperative area search for multiple UAVs based on RRT and decentralized receding horizon optimization. Proceedings of the 2009 7th Asian Control Conference, Hong Kong, China."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"131","DOI":"10.1016\/j.ast.2016.05.016","article-title":"Multi-UAVs tracking target in urban environment by model predictive control and Improved Grey Wolf Optimizer","volume":"55","author":"Yao","year":"2016","journal-title":"Aerosp. Sci. Technol."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"36","DOI":"10.1016\/j.swevo.2018.01.002","article-title":"Chaos-enhanced mobility models for multilevel swarms of UAVs","volume":"41","author":"Rosalie","year":"2018","journal-title":"Swarm Evol. Comput."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Stolfi, D.H., Brust, M.R., Danoy, G., and Bouvry, P. (2020, January 10\u201313). A Cooperative Coevolutionary Approach to Maximise Surveillance Coverage of UAV Swarms. Proceedings of the 2020 IEEE 17th Annual Consumer Communications & Networking Conference (CCNC), Las Vegas, NV, USA.","DOI":"10.1109\/CCNC46108.2020.9045643"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"9408","DOI":"10.3390\/s140609408","article-title":"Multi-agent cooperative target search","volume":"14","author":"Hu","year":"2014","journal-title":"Sensors"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Hayat, S., Yanmaz, E., Brown, T.X., and Bettstetter, C. (June, January 29). Multi-objective UAV path planning for search and rescue. Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore, Singapore.","DOI":"10.1109\/ICRA.2017.7989656"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Chen, H.X., Nan, Y., and Yang, Y. (2019). Multi-UAV Reconnaissance task assignment for heterogeneous targets based on modified symbiotic organisms search algorithm. 
Sensors, 19.","DOI":"10.3390\/s19030734"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"329","DOI":"10.1016\/j.ast.2018.10.017","article-title":"Fuzzy multiobjective cooperative surveillance of multiple UAVs based on distributed predictive control for unknown ground moving target in urban environment","volume":"84","author":"Hu","year":"2019","journal-title":"Aerosp. Sci. Technol."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"De Alcantara Andrade, F.A., Reinier Hovenburg, A., Netto de Lima, L., Dahlin Rodin, C., Johansen, T.A., Storvold, R., Moraes Correia, C.A., and Barreto Haddad, D. (2019). Autonomous unmanned aerial vehicles in search and rescue missions using real-time cooperative model predictive control. Sensors, 19.","DOI":"10.3390\/s19194067"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"665","DOI":"10.1007\/s10514-018-9735-4","article-title":"An integer linear programming model for fair multitarget tracking in cooperative multirobot systems","volume":"43","author":"Banfi","year":"2019","journal-title":"Auton. Robot."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Li, X., Chen, J., Deng, F., and Li, H. (2019). Profit-driven adaptive moving targets search with UAV swarms. Sensors, 19.","DOI":"10.3390\/s19071545"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"673","DOI":"10.1007\/s10514-019-09840-9","article-title":"Distributed multi-target search and tracking using the PHD filter","volume":"44","author":"Dames","year":"2020","journal-title":"Auton. Robot."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"3826","DOI":"10.1109\/TCYB.2020.2977374","article-title":"Deep Reinforcement Learning for Multiagent Systems: A Review of Challenges, Solutions, and Applications","volume":"50","author":"Nguyen","year":"2020","journal-title":"IEEE Trans. 
Cybern."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"529","DOI":"10.1038\/nature14236","article-title":"Human-level control through deep reinforcement learning","volume":"518","author":"Mnih","year":"2015","journal-title":"Nature"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"354","DOI":"10.1038\/nature24270","article-title":"Mastering the game of go without human knowledge","volume":"550","author":"Silver","year":"2017","journal-title":"Nature"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Walker, O., Vanegas, F., and Gonzalez, F. (2020). A framework for multi-agent UAV exploration and target-finding in GPS-denied and partially observable environments. Sensors, 20.","DOI":"10.3390\/s20174739"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Bhagat, S., and Sujit, P. (2020, January 1\u20134). UAV Target Tracking in Urban Environments Using Deep Reinforcement Learning. Proceedings of the 2020 International Conference on Unmanned Aircraft Systems (ICUAS), Athens, Greece.","DOI":"10.1109\/ICUAS48674.2020.9213856"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"610","DOI":"10.1109\/LRA.2019.2891991","article-title":"Deep reinforcement learning robot for search and rescue applications: Exploration in unknown cluttered environments","volume":"4","author":"Niroui","year":"2019","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Sharif Razavian, A., Azizpour, H., Sullivan, J., and Carlsson, S. (2014, January 23\u201328). CNN features off-the-shelf: An astounding baseline for recognition. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Columbus, Ohio.","DOI":"10.1109\/CVPRW.2014.131"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"2064","DOI":"10.1109\/TNNLS.2019.2927869","article-title":"Deep reinforcement learning-based automatic exploration for navigation in unknown environment","volume":"31","author":"Li","year":"2020","journal-title":"IEEE Trans. Neural Networks Learn. Syst."},{"key":"ref_29","unstructured":"Nair, V., and Hinton, G.E. (2010, January 21\u201324). Rectified linear units improve restricted Boltzmann machines. Proceedings of the 27th International Conference on Machine Learning, Haifa, Israel."},{"key":"ref_30","unstructured":"Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O. (2017). Proximal policy optimization algorithms. arXiv."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Long, P., Fan, T., Liao, X., Liu, W., Zhang, H., and Pan, J. (2018, January 21\u201325). Towards optimally decentralized multi-robot collision avoidance via deep reinforcement learning. Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia.","DOI":"10.1109\/ICRA.2018.8461113"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Yan, P., Bai, C., Zheng, H., and Guo, J. (2020, January 27\u201328). Flocking Control of UAV Swarms with Deep Reinforcement Learning Approach. Proceedings of the 3rd International Conference on Unmanned Systems (ICUS), Harbin, China.","DOI":"10.1109\/ICUS50048.2020.9274899"},{"key":"ref_33","unstructured":"Kingma, D.P., and Ba, J. (2015, January 7\u20139). Adam: A method for stochastic optimization. Proceedings of the International Conference on Learning Representations (ICLR), San Diego, CA, USA."},{"key":"ref_34","unstructured":"Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., and Antiga, L. (2019, January 8\u201314). 
Pytorch: An imperative style, high-performance deep learning library. Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/4\/1076\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T05:20:18Z","timestamp":1760160018000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/4\/1076"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,2,4]]},"references-count":34,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2021,2]]}},"alternative-id":["s21041076"],"URL":"https:\/\/doi.org\/10.3390\/s21041076","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,2,4]]}}}