{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T19:07:13Z","timestamp":1775070433636,"version":"3.50.1"},"reference-count":51,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2022,5,20]],"date-time":"2022-05-20T00:00:00Z","timestamp":1653004800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61862060"],"award-info":[{"award-number":["61862060"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61462079"],"award-info":[{"award-number":["61462079"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61562086"],"award-info":[{"award-number":["61562086"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>You Only Look Once (YOLO) series detectors are suitable for aerial image object detection because of their excellent real-time ability and performance. Their high performance depends heavily on the anchor generated by clustering the training set. However, the effectiveness of the general Anchor Generation algorithm is limited by the unique data distribution of the aerial image dataset. The divergence in the distribution of the number of objects with different sizes can cause the anchors to overfit some objects or be assigned to suboptimal layers because anchors of each layer are generated uniformly and affected by the overall data distribution. In this paper, we are inspired by experiments under different anchors settings and proposed the Layered Anchor Generation (LAG) algorithm. In the LAG, objects are layered by their diagonals, and then anchors of each layer are generated by analyzing the diagonals and aspect ratio of objects of the corresponding layer. In this way, anchors of each layer can better match the detection range of each layer. Experiment results showed that our algorithm is of good generality that significantly uprises the performance of You Only Look Once version 3 (YOLOv3), You Only Look Once version 5 (YOLOv5), You Only Learn One Representation (YOLOR), and Cascade Regions with CNN features (Cascade R-CNN) on the Vision Meets Drone (VisDrone) dataset and the object DetectIon in Optical Remote sensing images (DIOR) dataset, and these improvements are cost-free.<\/jats:p>","DOI":"10.3390\/s22103891","type":"journal-article","created":{"date-parts":[[2022,5,21]],"date-time":"2022-05-21T09:18:08Z","timestamp":1653124688000},"page":"3891","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["LAG: Layered Objects to Generate Better Anchors for Object Detection in Aerial Images"],"prefix":"10.3390","volume":"22","author":[{"given":"Xueqiang","family":"Wan","sequence":"first","affiliation":[{"name":"School of Software, Xinjiang University, Urumqi 830091, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jiong","family":"Yu","sequence":"additional","affiliation":[{"name":"School of Software, Xinjiang University, Urumqi 830091, China"},{"name":"College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Haotian","family":"Tan","sequence":"additional","affiliation":[{"name":"College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Junjie","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Software, Xinjiang University, Urumqi 830091, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2022,5,20]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"3574","DOI":"10.1109\/TIP.2014.2329767","article-title":"Hyperspectral image segmentation using a new spectral unmixing-based binary partition tree representation","volume":"23","author":"Veganzones","year":"2014","journal-title":"IEEE Trans. Image Process."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"105928","DOI":"10.1016\/j.asoc.2019.105928","article-title":"Fuzzy C-means clustering through SSIM and patch for image segmentation","volume":"87","author":"Tang","year":"2020","journal-title":"Appl. Soft Comput."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Xie, S., Girshick, R., Doll\u00e1r, P., Tu, Z., and He, K. (2017, January 21\u201326). Aggregated Residual Transformations for Deep Neural Networks. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.634"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"652","DOI":"10.1109\/TPAMI.2019.2938758","article-title":"Res2net: A new multi-scale backbone architecture","volume":"43","author":"Gao","year":"2019","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., and Chen, L.C. (2018, January 18\u201323). Mobilenetv2: Inverted residuals and linear bottlenecks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00474"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Doll\u00e1r, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature pyramid networks for object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1904","DOI":"10.1109\/TPAMI.2015.2389824","article-title":"Spatial pyramid pooling in deep convolutional networks for visual recognition","volume":"37","author":"He","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Hu, J., Shen, L., and Sun, G. (2018, January 18\u201323). Squeeze-and-excitation networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00745"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Wang, Q., Wu, B., Zhu, P., Li, P., Zuo, W., and Hu, Q. (2020, January 13\u201319). ECA-net: Efficient channel attention for deep convolutional neural networks. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01155"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1362","DOI":"10.1007\/s10489-021-02496-y","article-title":"DECA: A novel multi-scale efficient channel attention module for object detection in real-life fire images","volume":"52","author":"Wang","year":"2021","journal-title":"Appl. Intell."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., and Zagoruyko, S. (2020, January 23\u201328). End-to-end object detection with transformers. Proceedings of the European Conference on Computer Vision, Glasgow, UK.","DOI":"10.1007\/978-3-030-58452-8_13"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Dai, Z., Cai, B., Lin, Y., and Chen, J. (2021, January 20\u201325). Up-detr: Unsupervised pre-training for object detection with transformers. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00165"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Law, H., and Deng, J. (2018, January 8\u201314). Cornernet: Detecting objects as paired keypoints. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01264-9_45"},{"key":"ref_15","unstructured":"Duan, K., Bai, S., Xie, L., Qi, H., Huang, Q., and Tian, Q. (November, January 27). Centernet: Keypoint triplets for object detection. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Korea."},{"key":"ref_16","unstructured":"Yang, Z., Liu, S., Hu, H., Wang, L., and Lin, S. (November, January 27). Reppoints: Point set representation for object detection. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Korea."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Zhu, C., He, Y., and Savvides, M. (2019, January 15\u201320). Feature selective anchor-free module for single-shot object detection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00093"},{"key":"ref_18","unstructured":"Tian, Z., Shen, C., Chen, H., and He, T. (November, January 27). Fcos: Fully convolutional one-stage object detection. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Korea."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"7389","DOI":"10.1109\/TIP.2020.3002345","article-title":"Foveabox: Beyound anchor-based object detection","volume":"29","author":"Kong","year":"2020","journal-title":"IEEE Trans. Image Process."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Goyal, P., Girshick, R., He, K., and Doll\u00e1r, P. (2017, January 22\u201329). Focal loss for dense object detection. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.324"},{"key":"ref_21","unstructured":"Farhadi, A., and Redmon, J. (2018, January 18\u201323). Yolov3: An incremental improvement. Proceedings of the Computer Vision and Pattern Recognition, Salt Lake City, UT, USA."},{"key":"ref_22","unstructured":"Jocher, G., Stoken, A., Borovec, J., Chaurasia, A., Xie, T., Liu, C., Abhiram, V. (2021). ultralytics\/yolov5: v5.0\u2014YOLOv5-P6 1280 Models, AWS, Supervise.ly and YouTube Integrations, CERN Data Centre & Invenio.. version 5.0."},{"key":"ref_23","unstructured":"Wang, C.Y., Yeh, I.H., and Liao, H.Y.M. (2021). You Only Learn One Representation: Unified Network for Multiple Tasks. arXiv."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Girshick, R. (2015, January 7\u201313). Fast r-cnn. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.169"},{"key":"ref_25","first-page":"91","article-title":"Faster r-cnn: Towards real-time object detection with region proposal networks","volume":"28","author":"Ren","year":"2015","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Cai, Z., and Vasconcelos, N. (2018, January 18\u201323). Cascade r-cnn: Delving into high quality object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00644"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You only look once: Unified, real-time object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Redmon, J., and Farhadi, A. (2017, January 21\u201326). YOLO9000: Better, faster, stronger. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.690"},{"key":"ref_29","unstructured":"Bochkovskiy, A., Wang, C.Y., and Liao, H.Y.M. (2020). YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Wang, C.Y., Mark Liao, H.Y., Wu, Y.H., Chen, P.Y., Hsieh, J.W., and Yeh, I.H. (2020, January 14\u201319). CSPNet: A New Backbone That Can Enhance Learning Capability of CNN. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops, Seattle, WA, USA.","DOI":"10.1109\/CVPRW50498.2020.00203"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Liu, S., Qi, L., Qin, H., Shi, J., and Jia, J. (2018, January 18\u201323). Path aggregation network for instance segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00913"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Neubeck, A., and Van Gool, L. (2006, January 20\u201324). Efficient non-maximum suppression. Proceedings of the 18th International Conference on Pattern Recognition (ICPR\u201906), Hong Kong, China.","DOI":"10.1109\/ICPR.2006.479"},{"key":"ref_33","unstructured":"Yang, X., Yang, J., Yan, J., Zhang, Y., Zhang, T., Guo, Z., Sun, X., and Fu, K. (November, January 27). Scrdet: Towards more robust detection for small, cluttered and rotated objects. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Korea."},{"key":"ref_34","unstructured":"Yang, X., Liu, Q., Yan, J., Li, A., Zhang, Z., and Yu, G. (2019). R3det: Refined single-stage detector with feature refinement for rotating object. arXiv."},{"key":"ref_35","unstructured":"Van Etten, A. (2018). You only look twice: Rapid multi-scale object detection in satellite imagery. arXiv."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Zhu, X., Lyu, S., Wang, X., and Zhao, Q. (2021, January 11\u201317). TPH-YOLOv5: Improved YOLOv5 Based on Transformer Prediction Head for Object Detection on Drone-captured Scenarios. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, BC, Canada.","DOI":"10.1109\/ICCVW54120.2021.00312"},{"key":"ref_37","unstructured":"MacQueen, J. (1967, January 1). Some methods for classification and analysis of multivariate observations. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Oakland, CA, USA."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"2433","DOI":"10.1007\/s00371-020-01997-0","article-title":"Joint information fusion and multi-scale network model for pedestrian detection","volume":"37","author":"Zhang","year":"2021","journal-title":"Vis. Comput."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Ju, M., Luo, H., Wang, Z., Hui, B., and Chang, Z. (2019). The application of improved YOLO V3 in multi-scale target detection. Appl. Sci., 9.","DOI":"10.3390\/app9183775"},{"key":"ref_40","unstructured":"Hurtik, P., Molek, V., Hula, J., Vajgl, M., Vlasanek, P., and Nejezchleba, T. (2020). Poly-YOLO: Higher speed, more precise detection and instance segmentation for YOLOv3. arXiv."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"296","DOI":"10.1016\/j.isprsjprs.2019.11.023","article-title":"Object detection in optical remote sensing images: A survey and a new benchmark","volume":"159","author":"Li","year":"2020","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Cao, Y., He, Z., Wang, L., Wang, W., Yuan, Y., Zhang, D., Zhang, J., Zhu, P., Van Gool, L., and Han, J. (2021, January 11\u201317). VisDrone-DET2021: The Vision Meets Drone Object detection Challenge Results. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, BC, Canada.","DOI":"10.1109\/ICCVW54120.2021.00319"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., and Zitnick, C.L. (2014, January 6\u201312). Microsoft coco: Common objects in context. Proceedings of the European Conference on Computer Vision, Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Zhang, H., Cisse, M., Dauphin, Y.N., and Lopez-Paz, D. (2017). mixup: Beyond empirical risk minimization. arXiv.","DOI":"10.1007\/978-1-4899-7687-1_79"},{"key":"ref_45","unstructured":"Jocher, G., Kwon, Y., Veitch-Michaelis, J., Suess, D., Baltac\u0131, F., Bianconi, G. (2021). ultralytics\/yolov3: v9.5.0\u2014YOLOv5 v5.0 Release Compatibility Update for YOLOv3, CERN Data Centre & Invenio.. version 9.5.0."},{"key":"ref_46","unstructured":"Chen, K., Wang, J., Pang, J., Cao, Y., Xiong, Y., Li, X., Sun, S., Feng, W., Liu, Z., and Xu, J. (2019). MMDetection: Open MMLab Detection Toolbox and Benchmark. arXiv."},{"key":"ref_47","unstructured":"Zhu, B., Wang, J., Jiang, Z., Zong, F., Liu, S., Li, Z., and Sun, J. (2020). Autoassign: Differentiable label assignment for dense object detection. arXiv."},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Wang, N., Gao, Y., Chen, H., Wang, P., Tian, Z., Shen, C., and Zhang, Y. (2020, January 13\u201319). NAS-FCOS: Fast Neural Architecture Search for Object Detection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01196"},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Kim, K., and Lee, H.S. (2020, January 23\u201328). Probabilistic Anchor Assignment with IoU Prediction for Object Detection. Proceedings of the ECCV, Glasgow, UK.","DOI":"10.1007\/978-3-030-58595-2_22"},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Zhang, H., Wang, Y., Dayoub, F., and S\u00fcnderhauf, N. (2021, January 20\u201325). VarifocalNet: An IoU-aware Dense Object Detector. Proceedings of the CVPR, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00841"},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Zhang, S., Chi, C., Yao, Y., Lei, Z., and Li, S.Z. (2020, January 13\u201319). Bridging the gap between anchor-based and anchor-free detection via adaptive training sample selection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00978"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/10\/3891\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T23:15:35Z","timestamp":1760138135000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/10\/3891"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,5,20]]},"references-count":51,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2022,5]]}},"alternative-id":["s22103891"],"URL":"https:\/\/doi.org\/10.3390\/s22103891","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,5,20]]}}}