{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,25]],"date-time":"2026-02-25T18:17:11Z","timestamp":1772043431119,"version":"3.50.1"},"reference-count":51,"publisher":"MDPI AG","issue":"23","license":[{"start":{"date-parts":[[2024,12,5]],"date-time":"2024-12-05T00:00:00Z","timestamp":1733356800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Open Fund of Key Laboratory of Mine Environmental Monitoring and Improving around Poyang Lake, Ministry of Natural Resources","award":["MEMI-2023-05"],"award-info":[{"award-number":["MEMI-2023-05"]}]},{"name":"Open Fund of Key Laboratory of Mine Environmental Monitoring and Improving around Poyang Lake, Ministry of Natural Resources","award":["2024AAC05029"],"award-info":[{"award-number":["2024AAC05029"]}]},{"name":"Natural Science Foundation of Ningxia","award":["MEMI-2023-05"],"award-info":[{"award-number":["MEMI-2023-05"]}]},{"name":"Natural Science Foundation of Ningxia","award":["2024AAC05029"],"award-info":[{"award-number":["2024AAC05029"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>This paper proposes a semi-supervised query consistent transformer for optical remote sensing image object detection (SSOD-QCTR). A detection transformer (DETR)-like model is adopted as the basic network, and it follows the teacher\u2013student training scheme. The proposed method makes three major contributions. Firstly, to consider the problem of inaccurate pseudo-labels generated in the initial training epochs, a dynamic geometry-aware-based intersection over union (DGAIoU) loss function is proposed to dynamically update the weight coefficients according to the quality of the pseudo-labels in the current epoch. Secondly, we propose an improved focal (IF) loss function, which deals with the category imbalance problem by decreasing the category probability coefficients of the major categories. Thirdly, to solve the problem of uncertain correspondence between the output of the teacher and student models caused by the random initialization of the object queries, a query consistency (QC)-based loss function is proposed to introduce a consistency constraint of the outputs of the two models by taking the same regions of interest extracted from the pseudo-labels as the input object query. Extensive exploratory experiments on two publicly available datasets, DIOR and HRRSD, demonstrated that SSOD-QCTR outperforms the related methods, achieving a mAP of 65.28% and 81.73% for the DIOR and HRRSD datasets, respectively.<\/jats:p>","DOI":"10.3390\/rs16234556","type":"journal-article","created":{"date-parts":[[2024,12,5]],"date-time":"2024-12-05T04:13:03Z","timestamp":1733371983000},"page":"4556","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["SSOD-QCTR: Semi-Supervised Query Consistent Transformer for Optical Remote Sensing Image Object Detection"],"prefix":"10.3390","volume":"16","author":[{"ORCID":"https:\/\/orcid.org\/0009-0009-3321-7998","authenticated-orcid":false,"given":"Xinyu","family":"Ma","sequence":"first","affiliation":[{"name":"School of Information Engineering, Ningxia University, Yinchuan 750021, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8394-0078","authenticated-orcid":false,"given":"Pengyuan","family":"Lv","sequence":"additional","affiliation":[{"name":"School of Information Engineering, Ningxia University, Yinchuan 750021, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6700-2241","authenticated-orcid":false,"given":"Xunqiang","family":"Gong","sequence":"additional","affiliation":[{"name":"Key Laboratory of Mine Environmental Monitoring and Improving around Poyang Lake of Ministry of Natural Resources, East China University of Technology, Nanchang 330013, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2024,12,5]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You Only Look Once: Unified, Real-Time Object Detection. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Goyal, P., Girshick, R., He, K., and Doll\u00e1r, P. (2017, January 22\u201329). Focal Loss for Dense Object Detection. Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.324"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Doll\u00e1r, P., and Girshick, R. (2017, January 22\u201329). Mask R-CNN. Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Feng, C., Zhong, Y., Gao, Y., Scott, M.R., and Huang, W. (2021, January 10\u201317). TOOD: Task-aligned One-stage Object Detection. Proceedings of the 2021 IEEE\/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00349"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Tian, Z., Shen, C., Chen, H., and He, T. (November, January 27). FCOS: Fully Convolutional One-Stage Object Detection. Proceedings of the 2019 IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea.","DOI":"10.1109\/ICCV.2019.00972"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1137","DOI":"10.1109\/TPAMI.2016.2577031","article-title":"Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks","volume":"39","author":"Ren","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_7","first-page":"5610713","article-title":"Structure-Guided Feature Transform Hybrid Residual Network for Remote Sensing Object Detection","volume":"60","author":"Li","year":"2022","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_8","first-page":"5614914","article-title":"ABNet: Adaptive Balanced Network for Multiscale Object Detection in Remote Sensing Imagery","volume":"60","author":"Liu","year":"2022","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_9","first-page":"5603113","article-title":"Global to local: Clip-LSTM-based object detection from remote sensing images","volume":"60","author":"Teng","year":"2021","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Li, Q., Chen, Y., and Zeng, Y. (2022). Transformer with transfer CNN for remote-sensing-image object detection. Remote Sens., 14.","DOI":"10.3390\/rs14040984"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"6856","DOI":"10.1109\/JSTARS.2022.3198577","article-title":"Dual network structure with interweaved global-local feature hierarchy for transformer-based object detection in remote sensing image","volume":"15","author":"Xue","year":"2022","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"2342","DOI":"10.1109\/TCSVT.2022.3222906","article-title":"AO2-DETR: Arbitrary-Oriented Object Detection Transformer","volume":"33","author":"Dai","year":"2023","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"2337","DOI":"10.1109\/TGRS.2017.2778300","article-title":"Rotation-insensitive and context-augmented object detection in remote sensing images","volume":"56","author":"Li","year":"2017","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Ye, X., Xiong, F., Lu, J., Zhou, J., and Qian, Y. (2020). F3-Net: Feature Fusion and Filtration Network for Object Detection in Optical Remote Sensing Images. Remote Sens., 12.","DOI":"10.3390\/rs12244027"},{"key":"ref_15","first-page":"5608412","article-title":"MRDet: A multihead network for accurate rotated object detection in aerial images","volume":"60","author":"Qin","year":"2021","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Xia, G.S., Bai, X., Ding, J., Zhu, Z., Belongie, S., Luo, J., Datcu, M., Pelillo, M., and Zhang, L. (2018, January 18\u201323). DOTA: A large-scale dataset for object detection in aerial images. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00418"},{"key":"ref_17","first-page":"5405914","article-title":"SRAF-Net: A scene-relevant anchor-free object detection network in remote sensing images","volume":"60","author":"Liu","year":"2021","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., and Zagoruyko, S. (2020, January 23\u201328). End-to-end object detection with transformers. Proceedings of the European Conference on Computer Vision, Glasgow, UK.","DOI":"10.1007\/978-3-030-58452-8_13"},{"key":"ref_19","unstructured":"Liu, S., Li, F., Zhang, H., Yang, X., Qi, X., Su, H., Zhu, J., and Zhang, L. (2022). Dab-detr: Dynamic anchor boxes are better queries for detr. arXiv."},{"key":"ref_20","unstructured":"Zhang, H., Li, F., Liu, S., Zhang, L., Su, H., Zhu, J., Ni, L.M., and Shum, H.Y. (2022). Dino: Detr with improved denoising anchor boxes for end-to-end object detection. arXiv."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"6005905","DOI":"10.1109\/LGRS.2024.3378531","article-title":"QETR: A Query-Enhanced Transformer for Remote Sensing Image Object Detection","volume":"21","author":"Ma","year":"2024","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"2753","DOI":"10.1109\/JSTARS.2023.3254047","article-title":"MashFormer: A novel multiscale aware hybrid detector for remote sensing object detection","volume":"16","author":"Wang","year":"2023","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_23","first-page":"8000505","article-title":"Remote sensing object detection based on strong feature extraction and prescreening network","volume":"20","author":"Li","year":"2023","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_24","unstructured":"Sohn, K., Zhang, Z., Li, C.L., Zhang, H., Lee, C.Y., and Pfister, T. (2020). A simple semi-supervised learning framework for object detection. arXiv."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Yang, Q., Wei, X., Wang, B., Hua, X.S., and Zhang, L. (2021, January 20\u201325). Interactive Self-Training with Mean Teachers for Semi-supervised Object Detection. Proceedings of the 2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00588"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Wang, Z., Li, Y., Guo, Y., Fang, L., and Wang, S. (2021, January 20\u201325). Data-Uncertainty Guided Multi-Phase Learning for Semi-Supervised Object Detection. Proceedings of the 2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00454"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Zhou, Q., Yu, C., Wang, Z., Qian, Q., and Li, H. (2021, January 20\u201325). Instant-Teaching: An End-to-End Semi-Supervised Object Detection Framework. Proceedings of the 2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00407"},{"key":"ref_28","unstructured":"Liu, Y.C., Ma, C.Y., He, Z., Kuo, C.W., Chen, K., Zhang, P., Wu, B., Kira, Z., and Vajda, P. (2021). Unbiased teacher for semi-supervised object detection. arXiv."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Liu, Y.C., Ma, C.Y., and Kira, Z. (2022, January 18\u201324). Unbiased Teacher v2: Semi-supervised Object Detection for Anchor-free and Anchor-based Detectors. Proceedings of the 2022 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00959"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Xu, M., Zhang, Z., Hu, H., Wang, J., Wang, L., Wei, F., Bai, X., and Liu, Z. (2021, January 10\u201317). End-to-end semi-supervised object detection with soft teacher. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00305"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Tang, Y., Chen, W., Luo, Y., and Zhang, Y. (2021, January 20\u201325). Humble Teachers Teach Better Students for Semi-Supervised Object Detection. Proceedings of the 2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00315"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Mi, P., Lin, J., Zhou, Y., Shen, Y., Luo, G., Sun, X., Cao, L., Fu, R., Xu, Q., and Ji, R. (2022, January 18\u201324). Active Teacher for Semi-Supervised Object Detection. Proceedings of the 2022 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01408"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Liu, C., Zhang, W., Lin, X., Zhang, W., Tan, X., Han, J., Li, X., Ding, E., and Wang, J. (2023, January 17\u201324). Ambiguity-Resistant Semi-Supervised Learning for Dense Object Detection. Proceedings of the 2023 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.01495"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Wang, H., Liu, L., Zhang, B., Zhang, J., Zhang, W., Gan, Z., Wang, Y., Wang, C., and Wang, H. (2023, January 7\u201314). Calibrated teacher for sparsely annotated object detection. Proceedings of the AAAI Conference on Artificial Intelligence, Washington, DC, USA.","DOI":"10.1609\/aaai.v37i2.25349"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Chen, B., Li, P., Chen, X., Wang, B., Zhang, L., and Hua, X.S. (2022, January 18\u201324). Dense learning based semi-supervised object detection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00477"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Zhou, H., Ge, Z., Liu, S., Mao, W., Li, Z., Yu, H., and Sun, J. (2022, January 23\u201327). Dense teacher: Dense pseudo-labels for semi-supervised object detection. Proceedings of the 2022 European Conference on Computer Vision, Tel Aviv, Israel.","DOI":"10.1007\/978-3-031-20077-9_3"},{"key":"ref_37","first-page":"10759","article-title":"Consistency-based semi-supervised learning for object detection","volume":"32","author":"Jeong","year":"2019","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Wang, X., Yang, X., Zhang, S., Li, Y., Feng, L., Fang, S., Lyu, C., Chen, K., and Zhang, W. (2023, January 17\u201324). Consistent-Teacher: Towards Reducing Inconsistent Pseudo-Targets in Semi-Supervised Object Detection. Proceedings of the 2023 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.00316"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Jia, D., Yuan, Y., He, H., Wu, X., Yu, H., Lin, W., Sun, L., Zhang, C., and Hu, H. (2023, January 17\u201324). Detrs with hybrid matching. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.01887"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Zhang, J., Lin, X., Zhang, W., Wang, K., Tan, X., Han, J., Ding, E., Wang, J., and Li, G. (2023, January 17\u201324). Semi-DETR: Semi-Supervised Object Detection with Detection Transformers. Proceedings of the 2023 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.02280"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Rezatofighi, H., Tsoi, N., Gwak, J., Sadeghian, A., Reid, I., and Savarese, S. (2019, January 15\u201320). Generalized Intersection Over Union: A Metric and a Loss for Bounding Box Regression. Proceedings of the 2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00075"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"146","DOI":"10.1016\/j.neucom.2022.07.042","article-title":"Focal and efficient IOU loss for accurate bounding box regression","volume":"506","author":"Zhang","year":"2022","journal-title":"Neurocomputing"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Yu, J., Jiang, Y., Wang, Z., Cao, Z., and Huang, T. (2016, January 15\u201319). Unitbox: An advanced object detection network. Proceedings of the 24th ACM International Conference on Multimedia, Amsterdam, The Netherlands.","DOI":"10.1145\/2964284.2967274"},{"key":"ref_44","unstructured":"Leng, Z., Tan, M., Liu, C., Cubuk, E.D., Shi, X., Cheng, S., and Anguelov, D. (2022). Polyloss: A polynomial expansion perspective of classification loss functions. arXiv."},{"key":"ref_45","unstructured":"Xuan, G., Zhang, W., and Chai, P. (2001, January 7\u201310). EM algorithms of Gaussian mixture model and hidden Markov model. Proceedings of the 2001 International Conference on Image Processing (Cat. No. 01CH37205), Thessaloniki, Greece."},{"key":"ref_46","first-page":"21002","article-title":"Generalized focal loss: Learning qualified and distributed bounding boxes for dense object detection","volume":"33","author":"Li","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"296","DOI":"10.1016\/j.isprsjprs.2019.11.023","article-title":"Object detection in optical remote sensing images: A survey and a new benchmark","volume":"159","author":"Li","year":"2020","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"5535","DOI":"10.1109\/TGRS.2019.2900302","article-title":"Hierarchical and robust convolutional neural network for very high-resolution remote sensing object detection","volume":"57","author":"Zhang","year":"2019","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_50","unstructured":"Loshchilov, I., and Hutter, F. (2017). Decoupled weight decay regularization. arXiv."},{"key":"ref_51","unstructured":"Cho, Y.J. (2021). Weighted intersection over union (wIoU): A new evaluation metric for image segmentation. arXiv."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/16\/23\/4556\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T16:47:15Z","timestamp":1760114835000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/16\/23\/4556"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,12,5]]},"references-count":51,"journal-issue":{"issue":"23","published-online":{"date-parts":[[2024,12]]}},"alternative-id":["rs16234556"],"URL":"https:\/\/doi.org\/10.3390\/rs16234556","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,12,5]]}}}