{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T01:06:51Z","timestamp":1760144811282,"version":"build-2065373602"},"reference-count":49,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2024,5,24]],"date-time":"2024-05-24T00:00:00Z","timestamp":1716508800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62371135","U21A20471","2023J01431","JAT210036"],"award-info":[{"award-number":["62371135","U21A20471","2023J01431","JAT210036"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100003392","name":"Natural Science Foundation of Fujian Province","doi-asserted-by":"publisher","award":["62371135","U21A20471","2023J01431","JAT210036"],"award-info":[{"award-number":["62371135","U21A20471","2023J01431","JAT210036"]}],"id":[{"id":"10.13039\/501100003392","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Research Program for Young and Middle-Aged Teachers of Fujian Province","award":["62371135","U21A20471","2023J01431","JAT210036"],"award-info":[{"award-number":["62371135","U21A20471","2023J01431","JAT210036"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Generally, the interesting objects in aerial images are completely different from objects in nature, and the remote sensing objects in particular tend to be more distinctive in aspect ratio. The existing convolutional networks have equal aspect ratios of the receptive fields, which leads to receptive fields either containing non-relevant information or being unable to fully cover the entire object. To this end, we propose Horizontal and Vertical Convolution, which is a plug-and-play module to address different aspect ratio problems. In our method, we introduce horizontal convolution and vertical convolution to expand the receptive fields in the horizontal and vertical directions, respectively, to reduce redundant receptive fields, so that remote sensing objects with different aspect ratios can achieve better receptive fields coverage, thereby achieving more accurate feature representation. In addition, we design an attention module to dynamically aggregate these two sub-modules to achieve more accurate feature coverage. Extensive experimental results on the DOTA and HRSC2016 datasets show that our HVConv achieves accuracy improvements in diverse detection architectures and obtains SOTA accuracy (mAP score of 77.60% with DOTA single-scale training and mAP score of 81.07% with DOTA multi-scale training). Various ablation studies were conducted as well, which is enough to verify the effectiveness of our model.<\/jats:p>","DOI":"10.3390\/rs16111880","type":"journal-article","created":{"date-parts":[[2024,5,24]],"date-time":"2024-05-24T08:30:22Z","timestamp":1716539422000},"page":"1880","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["HVConv: Horizontal and Vertical Convolution for Remote Sensing Object Detection"],"prefix":"10.3390","volume":"16","author":[{"ORCID":"https:\/\/orcid.org\/0009-0008-9040-4055","authenticated-orcid":false,"given":"Jinhui","family":"Chen","sequence":"first","affiliation":[{"name":"College of Computer and Data Science, Fuzhou University, Fuzhou 350108, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8547-1802","authenticated-orcid":false,"given":"Qifeng","family":"Lin","sequence":"additional","affiliation":[{"name":"College of Computer and Data Science, Fuzhou University, Fuzhou 350108, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-8866-008X","authenticated-orcid":false,"given":"Haibin","family":"Huang","sequence":"additional","affiliation":[{"name":"College of Computer and Data Science, Fuzhou University, Fuzhou 350108, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yuanlong","family":"Yu","sequence":"additional","affiliation":[{"name":"College of Computer and Data Science, Fuzhou University, Fuzhou 350108, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4351-2359","authenticated-orcid":false,"given":"Daoye","family":"Zhu","sequence":"additional","affiliation":[{"name":"College of Computer and Data Science, Fuzhou University, Fuzhou 350108, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Gang","family":"Fu","sequence":"additional","affiliation":[{"name":"Department of Computing, The Hong Kong Polytechnic University, Hong Kong 999077, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2024,5,24]]},"reference":[{"key":"ref_1","first-page":"1","article-title":"Align Deep Features for Oriented Object Detection","volume":"60","author":"Han","year":"2022","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Yang, X., Yan, J., Feng, Z., and He, T. (2021, January 4\u20137). R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object. Proceedings of the 35th AAAI Conference on Artificial Intelligence, Virtual.","DOI":"10.1609\/aaai.v35i4.16426"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Pan, X., Ren, Y., Sheng, K., Dong, W., Yuan, H., Guo, X., Ma, C., and Xu, C. (2020, January 14\u201319). Dynamic refinement network for oriented and densely packed object detection. Proceedings of the 2020 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Virtual.","DOI":"10.1109\/CVPR42600.2020.01122"},{"key":"ref_4","first-page":"1","article-title":"MEDNet: Multiexpert Detection Network With Unsupervised Clustering of Training Samples","volume":"60","author":"Lin","year":"2022","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_5","unstructured":"Feng, L.Q., Luo Jun, L., Yuan Long, Y., and Fu, G. (November, January 29). A Multiple Prediction Mechanisms Ensemble for Complex Remote Sensing Scenes. Proceedings of the 31st ACM International Conference on Multimedia, Ottawa, ON, Canada."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the 2016 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Li, Y., Hou, Q., Zheng, Z., Cheng, M.M., Yang, J., and Li, X. (2023, January 2\u20136). Large Selective Kernel Network for Remote Sensing Object Detection. Proceedings of the 2023 IEEE International Conference on Computer Vision, Paris, France.","DOI":"10.1109\/ICCV51070.2023.01540"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Pu, Y., Wang, Y., Xia, Z., Han, Y., Wang, Y., Gan, W., Wang, Z., Song, S., and Huang, G. (2023, January 2\u20136). Adaptive Rotated Convolution for Rotated Object Detection. Proceedings of the 2023 IEEE International Conference on Computer Vision, Paris, France.","DOI":"10.1109\/ICCV51070.2023.00606"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Hu, J., Shen, L., and Sun, G. (2018, January 18\u201322). Squeeze-and-Excitation Networks. Proceedings of the 2018 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00745"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1074","DOI":"10.1109\/LGRS.2016.2565705","article-title":"Ship Rotated Bounding Box Space for Ship Extraction from High-Resolution Optical Satellite Images with Complex Backgrounds","volume":"13","author":"Liu","year":"2016","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Xia, G.S., Bai, X., Ding, J., Zhu, Z., Belongie, S., Luo, J., Datcu, M., Pelillo, M., and Zhang, L. (2018, January 18\u201322). DOTA: A Large-Scale Dataset for Object Detection in Aerial Images. Proceedings of the 2018 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00418"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Goyal, P., Girshick, R., He, K., and Dollar, P. (2017, January 22\u201329). Focal Loss for Dense Object Detection. Proceedings of the 2017 IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.324"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Zhang, G., Yu, W., and Hou, R. (2024). MFIL-FCOS: A Multi-Scale Fusion and Interactive Learning Method for 2D Object Detection and Remote Sensing Image Detection. Remote Sens., 16.","DOI":"10.3390\/rs16060936"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"1137","DOI":"10.1109\/TPAMI.2016.2577031","article-title":"Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks","volume":"39","author":"Ren","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"416","DOI":"10.1109\/TNNLS.2020.3027924","article-title":"CRPN-SFNet: A high-performance object detector on large-scale remote sensing images","volume":"33","author":"Lin","year":"2020","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Dollar, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature pyramid networks for object detection. Proceedings of the 2017 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Lin, Q., Zhao, J., Tong, Q., Zhang, G., Yuan, Z., and Fu, G. (2019, January 8\u201312). Cropping Region Proposal Network Based Framework for Efficient Object Detection on Large Scale Remote Sensing Images. Proceedings of the 2019 IEEE International Conference on Multimedia and Expo, Shanghai, China.","DOI":"10.1109\/ICME.2019.00265"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"108315","DOI":"10.1016\/j.patcog.2021.108315","article-title":"DDBN: Dual detection branch network for semantic diversity predictions","volume":"122","author":"Lin","year":"2022","journal-title":"Pattern Recognit."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"3111","DOI":"10.1109\/TMM.2018.2818020","article-title":"Arbitrary-oriented scene text detection via rotation proposals","volume":"20","author":"Ma","year":"2018","journal-title":"IEEE Trans. Multimed."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Ding, J., Xue, N., Long, Y., Xia, G.S., and Lu, Q. (2019, January 16\u201320). Learning roi transformer for oriented object detection in aerial images. Proceedings of the 2019 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00296"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Li, J., Chen, M., Hou, S., Wang, Y., Luo, Q., and Wang, C. (2023). An Improved S2A-Net Algorithm for Ship Object Detection in Optical Remote Sensing Images. Remote Sens., 15.","DOI":"10.3390\/rs15184559"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Li, X., Wang, W., Hu, X., and Yang, J. (2019, January 16\u201320). Selective kernel networks. Proceedings of the 2019 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00060"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna, Z. (2016, January 27\u201330). Rethinking the Inception Architecture for Computer Vision. Proceedings of the 2016 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.308"},{"key":"ref_24","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2021, January 3\u20137). An image is worth 16X16 Words: Transformers for image recognition at scale. Proceedings of the ICLR 2021\u20149th International Conference on Learning Representations, Virtual."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Guan, X., Dong, Y., Tan, W., Su, Y., and Huang, P. (2024). A Parameter-Free Pixel Correlation-Based Attention Module for Remote Sensing Object Detection. Remote Sens., 16.","DOI":"10.3390\/rs16020312"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Xie, X., Cheng, G., Wang, J., Yao, X., and Han, J. (2021, January 11\u201317). Oriented R-CNN for Object Detection. Proceedings of the 2021 IEEE International Conference on Computer Vision, Virtual.","DOI":"10.1109\/ICCV48922.2021.00350"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Zhou, Y., Yang, X., Zhang, G., Wang, J., Liu, Y., Hou, L., Jiang, X., Liu, X., Yan, J., and Lyu, C. (2022, January 10\u201314). MMRotate: A Rotated Object Detection Benchmark using PyTorch. Proceedings of the 30th ACM International Conference on Multimedia, Lisboa, Portugal.","DOI":"10.1145\/3503161.3548541"},{"key":"ref_28","unstructured":"Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., and Zisserman, A. (2024, March 21). The PASCAL Visual Object Classes Challenge 2007 (VOC2007) Results. Available online: http:\/\/host.robots.ox.ac.uk\/pascal\/VOC\/voc2007\/."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Chen, Z., Chen, K., Lin, W., See, J., Yu, H., Ke, Y., and Yang, C. (2020, January 29). PIoU Loss: Towards Accurate Oriented Object Detection in Complex Environments. Proceedings of the Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Glasgow, UK.","DOI":"10.1007\/978-3-030-58558-7_12"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Qian, W., Yang, X., Peng, S., Yan, J., and Guo, Y. (2021, January 4\u20137). Learning Modulated Loss for Rotated Object Detection. Proceedings of the 35th AAAI Conference on Artificial Intelligence, Virtual.","DOI":"10.1609\/aaai.v35i3.16347"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Ming, Q., Zhou, Z., Miao, L., Zhang, H., and Li, L. (2021, January 4\u20137). Dynamic Anchor Learning for Arbitrary-Oriented Object Detection. Proceedings of the 35th AAAI Conference on Artificial Intelligence, Virtual.","DOI":"10.1609\/aaai.v35i3.16336"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Huo, L., Hou, J., Feng, J., Wang, W., and Liu, J. (2024). Global and Multiscale Aggregate Network for Saliency Object Detection in Optical Remote Sensing Images. Remote Sens., 16.","DOI":"10.3390\/rs16040624"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Shen, Y., Liu, D., Chen, J., Wang, Z., Wang, Z., and Zhang, Q. (2023). On-Board Multi-Class Geospatial Object Detection Based on Convolutional Neural Network for High Resolution Remote Sensing Images. Remote Sens., 15.","DOI":"10.3390\/rs15163963"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Yang, X., Yang, J., Yan, J., Zhang, Y., Zhang, T., Guo, Z., Sun, X., and Fu, K. (November, January 27). SCRDet: Towards more robust detection for small, cluttered and rotated objects. Proceedings of the 2019 IEEE International Conference on Computer Vision, Seoul, Republic of Korea.","DOI":"10.1109\/ICCV.2019.00832"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"1452","DOI":"10.1109\/TPAMI.2020.2974745","article-title":"Gliding Vertex on the Horizontal Bounding Box for Multi-Oriented Object Detection","volume":"43","author":"Xu","year":"2021","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Li, C., Xu, C., Cui, Z., Wang, D., Zhang, T., and Yang, J. (2019, January 22\u201325). Feature-Attentioned Object Detection in Remote Sensing Imagery. Proceedings of the 2019 International Conference on Image Processing, Taipei, Taiwan.","DOI":"10.1109\/ICIP.2019.8803521"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Wang, J., Ding, J., Guo, H., Cheng, W., Pan, T., and Yang, W. (2019). Mask OBB: A Semantic Attention-Based Mask Oriented Bounding Box Representation for Multi-Category Object Detection in Aerial Images. Remote Sens., 11.","DOI":"10.3390\/rs11242930"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Han, J., Ding, J., Xue, N., and Xia, G.S. (June, January 19). ReDeT: A Rotation-equivariant Detector for Aerial Object Detection. Proceedings of the 2021 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00281"},{"key":"ref_39","first-page":"1","article-title":"Anchor-Free Oriented Proposal Generator for Object Detection","volume":"60","author":"Cheng","year":"2022","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Hou, L., Lu, K., Xue, J., and Li, Y. (March, January 22). Shape-Adaptive Selection and Measurement for Oriented Object Detection. Proceedings of the 36th AAAI Conference on Artificial Intelligence, Virtual.","DOI":"10.1609\/aaai.v36i1.19975"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1145\/3513133","article-title":"Towards Accurate Oriented Object Detection in Aerial Images with Adaptive Multi-level Feature Fusion","volume":"19","author":"Zhen","year":"2023","journal-title":"ACM Trans. Multimedia Comput. Commun. Appl."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Yang, J., Liu, Q., and Zhang, K. (2017, January 21\u201326). Stacked Hourglass Network for Robust Facial Landmark Localisation. Proceedings of the 2017 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA.","DOI":"10.1109\/CVPRW.2017.253"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Yu, F., Wang, D., Shelhamer, E., and Darrell, T. (2018, January 18\u201322). Deep Layer Aggregation. Proceedings of the 2018 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00255"},{"key":"ref_44","unstructured":"Yang, X., Yan, J., Ming, Q., Wang, W., Zhang, X., and Tian, Q. (2021, January 18\u201324). Rethinking Rotated Object Detection with Gaussian Wasserstein Distance Loss. Proceedings of the 2021 Machine Learning Research, Virtual."},{"key":"ref_45","unstructured":"Yang, X., Yang, X., Yang, J., Ming, Q., Wang, W., Tian, Q., and Yan, J. (2021, January 6\u201314). Learning High-Precision Bounding Box for Rotated Object Detection via Kullback-Leibler Divergence. Proceedings of the 2021 Advances in Neural Information Processing Systems, Virtual."},{"key":"ref_46","unstructured":"Yang, X., Zhou, Y., Zhang, G., Yang, J., Wang, W., Yan, J., Zhang, X., and Tian, Q. (2022). The KFIoU Loss for Rotated Object Detection. arXiv."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/TGRS.2023.3335484","article-title":"Advancing Plain Vision Transformer Toward Remote Sensing Foundation Model","volume":"61","author":"Wang","year":"2023","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021, January 11\u201317). Swin Transformer: Hierarchical Vision Transformer using Shifted Windows. Proceedings of the 2021 IEEE International Conference on Computer Vision, Virtual.","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"5252","DOI":"10.1109\/TCSVT.2022.3140248","article-title":"Convex-Hull Feature Adaptation for Oriented and Densely Packed Object Detection","volume":"32","author":"Guo","year":"2022","journal-title":"IEEE Trans. Circuits Syst. Video Technol."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/16\/11\/1880\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T14:48:10Z","timestamp":1760107690000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/16\/11\/1880"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,5,24]]},"references-count":49,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2024,6]]}},"alternative-id":["rs16111880"],"URL":"https:\/\/doi.org\/10.3390\/rs16111880","relation":{},"ISSN":["2072-4292"],"issn-type":[{"type":"electronic","value":"2072-4292"}],"subject":[],"published":{"date-parts":[[2024,5,24]]}}}