{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,14]],"date-time":"2026-04-14T16:13:59Z","timestamp":1776183239007,"version":"3.50.1"},"reference-count":62,"publisher":"MDPI AG","issue":"20","license":[{"start":{"date-parts":[[2023,10,19]],"date-time":"2023-10-19T00:00:00Z","timestamp":1697673600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"the National Natural Science Foundation of China","award":["62271166"],"award-info":[{"award-number":["62271166"]}]},{"name":"the National Natural Science Foundation of China","award":["IR2021104"],"award-info":[{"award-number":["IR2021104"]}]},{"name":"Interdisciplinary Research Foundation of HIT","award":["62271166"],"award-info":[{"award-number":["62271166"]}]},{"name":"Interdisciplinary Research Foundation of HIT","award":["IR2021104"],"award-info":[{"award-number":["IR2021104"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>A discernible gap has materialized between the expectations for object detection tasks in optical remote sensing images and the increasingly sophisticated design methods. The flexibility of deep learning object detection algorithms allows the selection and combination of multiple basic structures and model sizes, but this selection process relies heavily on human experience and lacks reliability when faced with special scenarios or extreme data distribution. To address these inherent challenges, this study proposes an approach that leverages deep reinforcement learning within the framework of vision tasks. This study introduces a Task-Risk Consistent Intelligent Detection Framework (TRC-ODF) for object detection in optical remote sensing images. 
The proposed framework designs a model optimization strategy based on deep reinforcement learning that systematically integrates the available information from images and vision processes. The core of the reinforcement learning agent is the proposed task-risk consistency reward mechanism, which is the driving force behind the optimal prediction allocation in the decision-making process. To verify the effectiveness of the proposed framework, multiple sets of empirical evaluations are conducted on representative optical remote sensing image datasets: RSOD, NWPU VHR-10, and DIOR. When applying the proposed framework to representative advanced detection models, the mean average precision (mAP@0.5 and mAP@0.5:0.95) is improved by 0.8\u20135.4 and 0.4\u20132.7, respectively. The obtained results showcase the considerable promise and potential of the TRC-ODF framework to address the challenges associated with object detection in optical remote sensing images.<\/jats:p>","DOI":"10.3390\/rs15205031","type":"journal-article","created":{"date-parts":[[2023,10,19]],"date-time":"2023-10-19T11:46:26Z","timestamp":1697715986000},"page":"5031","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["A Task-Risk Consistency Object Detection Framework Based on Deep Reinforcement Learning"],"prefix":"10.3390","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7387-4970","authenticated-orcid":false,"given":"Jiazheng","family":"Wen","sequence":"first","affiliation":[{"name":"Faculty of Computing, Harbin Institute of Technology, Harbin 150080, China"}]},{"given":"Huanyu","family":"Liu","sequence":"additional","affiliation":[{"name":"Faculty of Computing, Harbin Institute of Technology, Harbin 150080, China"}]},{"given":"Junbao","family":"Li","sequence":"additional","affiliation":[{"name":"Faculty of Computing, Harbin Institute of Technology, Harbin 150080, 
China"}]}],"member":"1968","published-online":{"date-parts":[[2023,10,19]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1016\/j.isprsjprs.2016.03.014","article-title":"A survey on object detection in optical remote sensing images","volume":"117","author":"Cheng","year":"2016","journal-title":"ISPRS J. Photogramm. Remote. Sens."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"5553","DOI":"10.1109\/TGRS.2016.2569141","article-title":"Weakly Supervised Learning Based on Coupled Convolutional Neural Networks for Aircraft Detection","volume":"54","author":"Zhang","year":"2016","journal-title":"IEEE Trans. Geosci. Remote. Sens."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Aposporis, P. (2020, January 7\u201310). Object Detection Methods for Improving UAV Autonomy and Remote Sensing Applications. Proceedings of the 2020 IEEE\/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), The Hague, The Netherlands.","DOI":"10.1109\/ASONAM49781.2020.9381377"},{"key":"ref_4","unstructured":"Barrett, E.C. (1999). Introduction to Environmental Remote Sensing, Routledge."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Murayama, Y., Kamusoko, C., Yamashita, A., and Estoque, R.C. (2017). Urban Development in Asia and Africa: Geospatial Analysis of Metropolises, Springer.","DOI":"10.1007\/978-981-10-3241-7"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"296","DOI":"10.1016\/j.isprsjprs.2019.11.023","article-title":"Object detection in optical remote sensing images: A survey and a new benchmark","volume":"159","author":"Li","year":"2020","journal-title":"ISPRS J. Photogramm. Remote. Sens."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/TGRS.2023.3321956","article-title":"Instance-Aware Distillation for Efficient Object Detection in Remote Sensing Images","volume":"61","author":"Li","year":"2023","journal-title":"IEEE Trans. 
Geosci. Remote. Sens."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"353","DOI":"10.1016\/j.isprsjprs.2022.12.004","article-title":"Generalized few-shot object detection in remote sensing images","volume":"195","author":"Zhang","year":"2023","journal-title":"ISPRS J. Photogramm. Remote. Sens."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"119132","DOI":"10.1016\/j.eswa.2022.119132","article-title":"Info-FPN: An Informative Feature Pyramid Network for object detection in remote sensing images","volume":"214","author":"Chen","year":"2023","journal-title":"Expert Syst. Appl."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Zhou, H., Ma, A., Niu, Y., and Ma, Z. (2022). Small-Object Detection for UAV-Based Images Using a Distance Metric Method. Drones, 6.","DOI":"10.3390\/drones6100308"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Liu, H., Yu, Y., Liu, S., and Wang, W. (2022). A Military Object Detection Model of UAV Reconnaissance Image and Feature Visualization. Appl. Sci., 12.","DOI":"10.3390\/app122312236"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Kreutzer, J., Khadivi, S., Matusov, E., and Riezler, S. (2018). Can Neural Machine Translation be Improved with User Feedback?. arXiv.","DOI":"10.18653\/v1\/N18-3012"},{"key":"ref_13","first-page":"3008","article-title":"Learning to summarize with human feedback","volume":"33","author":"Stiennon","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_14","unstructured":"Pinto, A.S., Kolesnikov, A., Shi, Y., Beyer, L., and Zhai, X. (2023). Tuning computer vision models with task rewards. arXiv."},{"key":"ref_15","first-page":"27730","article-title":"Training language models to follow instructions with human feedback","volume":"35","author":"Ouyang","year":"2022","journal-title":"Adv. Neural Inf. Process. 
Syst."},{"key":"ref_16","first-page":"1","article-title":"Pre-Train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing","volume":"55","author":"Liu","year":"2023","journal-title":"ACM Comput. Surv."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Uzkent, B., Yeh, C., and Ermon, S. (2020, January 1\u20135). Efficient Object Detection in Large Images Using Deep Reinforcement Learning. Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision, Snowmass Village, CO, USA.","DOI":"10.1109\/WACV45572.2020.9093447"},{"key":"ref_18","first-page":"29","article-title":"Tree-Structured Reinforcement Learning for Sequential Object Localization","volume":"29","author":"Jie","year":"2016","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"573","DOI":"10.1109\/TCDS.2018.2885813","article-title":"Multitask Learning for Object Localization With Deep Reinforcement Learning","volume":"11","author":"Wang","year":"2019","journal-title":"IEEE Trans. Cogn. Dev. Syst."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Pirinen, A., and Sminchisescu, C. (2018, January 18\u201323). Deep Reinforcement Learning of Region Proposal Networks for Object Detection. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00726"},{"key":"ref_21","unstructured":"Dalal, N., and Triggs, B. (2005, January 20\u201325). Histograms of oriented gradients for human detection. Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR\u201905), San Diego, CA, USA."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1023\/B:VISI.0000029664.99615.94","article-title":"Distinctive Image Features from Scale-Invariant Keypoints","volume":"60","author":"Lowe","year":"2004","journal-title":"Int. J. Comput. 
Vis."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Yan, J., Lei, Z., Wen, L., and Li, S.Z. (2014, January 23\u201328). The Fastest Deformable Part Model for Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.320"},{"key":"ref_24","unstructured":"Ren, S., He, K., Girshick, R., and Sun, J. (2015). Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. Adv. Neural Inf. Process. Syst., 28."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Cai, Z., and Vasconcelos, N. (2018, January 18\u201323). Cascade R-CNN: Delving Into High Quality Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00644"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Leibe, B., Matas, J., Sebe, N., and Welling, M. (2016, January 11\u201325). SSD: Single Shot MultiBox Detector. Proceedings of the Computer Vision\u2014ECCV 2016: 14th European Conference, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46454-1"},{"key":"ref_27","unstructured":"Lin, T.Y., Goyal, P., Girshick, R., He, K., and Dollar, P. (2019, January 22\u201329). Focal Loss for Dense Object Detection. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy."},{"key":"ref_28","unstructured":"Tian, Z., Shen, C., Chen, H., and He, T. (November, January 27). FCOS: Fully Convolutional One-Stage Object Detection. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Republic of Korea."},{"key":"ref_29","unstructured":"Duan, K., Bai, S., Xie, L., Qi, H., Huang, Q., and Tian, Q. (November, January 27). CenterNet: Keypoint Triplets for Object Detection. 
Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Republic of Korea."},{"key":"ref_30","unstructured":"Ge, Z., Liu, S., Wang, F., Li, Z., and Sun, J. (2021). YOLOX: Exceeding YOLO Series in 2021. arXiv."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You Only Look Once: Unified, Real-Time Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Redmon, J., and Farhadi, A. (2017, January 21\u201326). YOLO9000: Better, Faster, Stronger. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.690"},{"key":"ref_33","unstructured":"Redmon, J., and Farhadi, A. (2018). YOLOv3: An Incremental Improvement. arXiv."},{"key":"ref_34","unstructured":"Bochkovskiy, A., Wang, C.Y., and Liao, H.Y.M. (2020). YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv."},{"key":"ref_35","unstructured":"Jocher, G., Chaurasia, A., Stoken, A., Borovec, J., NanoCode012, Kwon, Y., Michael, K., Fang, J. (2020, May 15). ultralytics\/yolov5: v7.0\u2014YOLOv5 SOTA Realtime Instance Segmentation. Zenodo, 2022. Available online: https:\/\/zenodo.org\/records\/7347926."},{"key":"ref_36","unstructured":"Li, C., Li, L., Jiang, H., Weng, K., Geng, Y., Li, L., Ke, Z., Li, Q., Cheng, M., and Nie, W. (2022). YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications. arXiv."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Wang, C.Y., Bochkovskiy, A., and Liao, H.Y.M. (2023, January 18\u201322). YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.00721"},{"key":"ref_38","unstructured":"Jocher, G., Chaurasia, A., and Qiu, J. (2023, March 12). YOLO by Ultralytics. Available online: https:\/\/github.com\/ultralytics\/ultralytics."},{"key":"ref_39","unstructured":"Zhu, X., Su, W., Lu, L., Li, B., Wang, X., and Dai, J. (2020). Deformable DETR: Deformable Transformers for End-to-End Object Detection. arXiv."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"26","DOI":"10.1109\/MSP.2017.2743240","article-title":"Deep Reinforcement Learning: A Brief Survey","volume":"34","author":"Arulkumaran","year":"2017","journal-title":"IEEE Signal Process. Mag."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"529","DOI":"10.1038\/nature14236","article-title":"Human-level control through deep reinforcement learning","volume":"518","author":"Mnih","year":"2015","journal-title":"Nature"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"van Hasselt, H., Guez, A., and Silver, D. (2016, January 12\u201317). Deep Reinforcement Learning with Double Q-Learning. Proceedings of the AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA.","DOI":"10.1609\/aaai.v30i1.10295"},{"key":"ref_43","unstructured":"Wang, Z., Schaul, T., Hessel, M., Hasselt, H., Lanctot, M., and Freitas, N. (2016, January 20\u201322). Dueling Network Architectures for Deep Reinforcement Learning. Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Sutton, R.S. (1992). Reinforcement Learning, Springer US.","DOI":"10.1007\/978-1-4615-3618-5"},{"key":"ref_45","unstructured":"Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O. (2017). Proximal Policy Optimization Algorithms. 
arXiv."},{"key":"ref_46","unstructured":"Mnih, V., Badia, A.P., Mirza, M., Graves, A., Lillicrap, T., Harley, T., Silver, D., and Kavukcuoglu, K. (2016, January 20\u201322). Asynchronous Methods for Deep Reinforcement Learning. Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA."},{"key":"ref_47","unstructured":"Babaeizadeh, M., Frosio, I., Tyree, S., Clemons, J., and Kautz, J. (2017). Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU. arXiv."},{"key":"ref_48","unstructured":"Holliday, J.B., and Le, T.N. Follow then Forage Exploration: Improving Asynchronous Advantage Actor Critic. Proceedings of the Computer Science & Information Technology."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Caicedo, J.C., and Lazebnik, S. (2015, January 7\u201313). Active Object Localization with Deep Reinforcement Learning. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile.","DOI":"10.1109\/ICCV.2015.286"},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Mathe, S., Pirinen, A., and Sminchisescu, C. (2016, January 27\u201330). Reinforcement Learning for Visual Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.316"},{"key":"ref_51","first-page":"3","article-title":"Hierarchical object detection with deep reinforcement learning","volume":"31","author":"Bueno","year":"2017","journal-title":"Deep. Learn. Image Process. Appl."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Ayle, M., Tekli, J., El-Zini, J., El-Asmar, B., and Awad, M. (2020, January 7\u201312). BAR\u2014A Reinforcement Learning Agent for Bounding-Box Automated Refinement. 
Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA.","DOI":"10.1609\/aaai.v34i03.5639"},{"key":"ref_53","unstructured":"Navarro, F., Sekuboyina, A., Waldmannstetter, D., Peeken, J.C., Combs, S.E., and Menze, B.H. (2020, January 6\u20138). Deep Reinforcement Learning for Organ Localization in CT. Proceedings of the Third Conference on Medical Imaging with Deep Learning, Montreal, QC, Canada."},{"key":"ref_54","unstructured":"Bhatt, A., Argus, M., Amiranashvili, A., and Brox, T. (2019). CrossNorm: Normalization for Off-Policy TD Reinforcement Learning. arXiv."},{"key":"ref_55","unstructured":"Lillicrap, T.P., Hunt, J.J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., and Wierstra, D. (2019). Continuous Control with Deep Reinforcement Learning. arXiv."},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., and Fei-Fei, L. (2009, January 20\u201325). ImageNet: A large-scale hierarchical image database. Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"ref_57","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1007\/s11263-009-0275-4","article-title":"The Pascal Visual Object Classes (VOC) Challenge","volume":"88","author":"Everingham","year":"2010","journal-title":"Int. J. Comput. Vis."},{"key":"ref_58","unstructured":"Fleet, D., Pajdla, T., Schiele, B., and Tuytelaars, T. (2014, January 6\u201312). Microsoft COCO: Common Objects in Context. Proceedings of the 13th European Conference, Zurich, Switzerland."},{"key":"ref_59","doi-asserted-by":"crossref","first-page":"2486","DOI":"10.1109\/TGRS.2016.2645610","article-title":"Accurate Object Localization in Remote Sensing Images Based on Convolutional Neural Networks","volume":"55","author":"Long","year":"2017","journal-title":"IEEE Trans. Geosci. Remote. 
Sens."},{"key":"ref_60","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1016\/j.isprsjprs.2014.10.002","article-title":"Multi-class geospatial object detection and geographic image classification based on collection of part detectors","volume":"98","author":"Cheng","year":"2014","journal-title":"ISPRS J. Photogramm. Remote. Sens."},{"key":"ref_61","doi-asserted-by":"crossref","unstructured":"Cramer, M. (2010). The DGPF-Test on Digital Airborne Camera Evaluation Overview and Test Design, Photogrammetrie-Fernerkundung-Geoinformation Schweizerbart Science Publishers.","DOI":"10.1127\/1432-8364\/2010\/0041"},{"key":"ref_62","doi-asserted-by":"crossref","unstructured":"Padilla, R., Netto, S.L., and da Silva, E.A.B. (2020, January 1\u20133). A Survey on Performance Metrics for Object-Detection Algorithms. Proceedings of the 2020 International Conference on Systems, Signals and Image Processing (IWSSIP), Niteroi, Brazil.","DOI":"10.1109\/IWSSIP48289.2020.9145130"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/20\/5031\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T21:10:03Z","timestamp":1760130603000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/20\/5031"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,10,19]]},"references-count":62,"journal-issue":{"issue":"20","published-online":{"date-parts":[[2023,10]]}},"alternative-id":["rs15205031"],"URL":"https:\/\/doi.org\/10.3390\/rs15205031","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,10,19]]}}}