{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,31]],"date-time":"2025-10-31T08:00:00Z","timestamp":1761897600850,"version":"build-2065373602"},"reference-count":42,"publisher":"MDPI AG","issue":"16","license":[{"start":{"date-parts":[[2021,8,22]],"date-time":"2021-08-22T00:00:00Z","timestamp":1629590400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Unmanned Aerial Vehicles (UAVs) can serve as an ideal mobile platform in various situations. Real-time object detection with on-board apparatus provides drones with increased flexibility as well as a higher intelligence level. In order to achieve good detection results in UAV images with complex ground scenes, small object size and high object density, most of the previous work introduced models with higher computational burdens, making deployment on mobile platforms more difficult. This paper puts forward a lightweight object detection framework. Besides being anchor-free, the framework is based on a lightweight backbone and a simultaneous up-sampling and detection module to form a more efficient detection architecture. Meanwhile, we add an objectness branch to assist the multi-class center point prediction, which notably improves the detection accuracy and only takes up very little computing resources. The results of the experiment indicate that the computational cost of this paper is 92.78% lower than the CenterNet with ResNet18 backbone, and the mAP is 2.8 points higher on the Visdrone-2018-VID dataset. A frame rate of about 220 FPS is achieved. 
Additionally, we perform ablation experiments to check on the validity of each part, and the method we propose is compared with other representative lightweight object detection methods on UAV image datasets.<\/jats:p>","DOI":"10.3390\/s21165656","type":"journal-article","created":{"date-parts":[[2021,8,22]],"date-time":"2021-08-22T22:59:27Z","timestamp":1629673167000},"page":"5656","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["Lightweight Detection Network Based on Sub-Pixel Convolution and Objectness-Aware Structure for UAV Images"],"prefix":"10.3390","volume":"21","author":[{"given":"Xuanye","family":"Li","sequence":"first","affiliation":[{"name":"School of Electrical and Information Engineering, Beihang University, Beijing 100191, China"}]},{"given":"Hongguang","family":"Li","sequence":"additional","affiliation":[{"name":"Unmanned System Research Institute, Beihang University, Beijing 100191, China"}]},{"given":"Yalong","family":"Jiang","sequence":"additional","affiliation":[{"name":"Unmanned System Research Institute, Beihang University, Beijing 100191, China"}]},{"given":"Meng","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Electrical and Information Engineering, Beihang University, Beijing 100191, China"}]}],"member":"1968","published-online":{"date-parts":[[2021,8,22]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"22","DOI":"10.1109\/MCOM.2017.1600238CM","article-title":"UAV-enabled intelligent transportation systems for the smart city: Applications and challenges","volume":"55","author":"Menouar","year":"2017","journal-title":"IEEE Commun. 
Mag."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"54","DOI":"10.1109\/TITS.2018.2797697","article-title":"Real-Time Traffic Flow Parameter Estimation from UAV Video Based on Ensemble Classifier and Optical Flow","volume":"20","author":"Ke","year":"2019","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"890","DOI":"10.1109\/TITS.2016.2595526","article-title":"Real-Time Bidirectional Traffic Flow Parameter Estimation from Aerial Videos","volume":"18","author":"Ke","year":"2017","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1845","DOI":"10.1109\/TITS.2016.2617202","article-title":"An Enhanced Viola-Jones Vehicle Detection Method from Unmanned Aerial Vehicles Imagery","volume":"18","author":"Xu","year":"2017","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"497","DOI":"10.1109\/TITS.2017.2782790","article-title":"Effective and Efficient Detection of Moving Targets from a UAV\u2019s Camera","volume":"19","author":"Minaeian","year":"2018","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Zhang, X., Izquierdo, E., and Chandramouli, K. (2019, January 27\u201328). Dense and Small Object Detection in UAV Vision Based on Cascade Network. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Korea.","DOI":"10.1109\/ICCVW.2019.00020"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Chen, C., Zhang, Y., Lv, Q., Wei, S., Wang, X., Sun, X., and Dong, J. (2019, January 27\u201328). RRNet: A Hybrid Detector for Object Detection in Drone-Captured Images. 
Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Korea.","DOI":"10.1109\/ICCVW.2019.00018"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"4968","DOI":"10.1109\/JSTARS.2018.2879368","article-title":"Urban traffic density estimation based on ultrahigh-resolution uav video and deep neural network","volume":"11","author":"Zhu","year":"2018","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Tang, Z., Liu, X., Shen, G., and Yang, B. (2020). PENet: Object Detection Using Points Estimation in Aerial Images. arXiv.","DOI":"10.1109\/ICMLA51294.2020.00069"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Hong, S., Kang, S., and Cho, D. (2019, January 27\u201328). Patch-Level Augmentation for Object Detection in Aerial Images. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Korea.","DOI":"10.1109\/ICCVW.2019.00021"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., and Chen, L.C. (2018, January 18\u201323). Mobilenetv2: Inverted residuals and linear bottlenecks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00474"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Shi, W., Caballero, J., Husz\u00e1r, F., Totz, J., Aitken, A.P., Bishop, R., Rueckert, D., and Wang, Z. (2016, January 27\u201330). Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.207"},{"key":"ref_13","first-page":"91","article-title":"Faster r-cnn: Towards real-time object detection with region proposal networks","volume":"28","author":"Ren","year":"2015","journal-title":"Adv. 
Neural Inf. Process. Syst."},{"key":"ref_14","unstructured":"Dai, J., Li, Y., He, K., and Sun, J. (2016, January 5\u201310). R-FCN: Object Detection via Region-based Fully Convolutional Networks. Proceedings of the Advances in Neural Information Processing Systems (NIPS), Barcelona, Spain."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Cai, Z., and Vasconcelos, N. (2018, January 18\u201323). Cascade r-cnn: Delving into high quality object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00644"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Doll\u00e1r, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature pyramid networks for object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., and Berg, A.C. (2016). Ssd: Single shot multibox detector. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_18","unstructured":"Redmon, J., and Farhadi, A. (2018). Yolov3: An Incremental Improvement. arXiv."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Goyal, P., Girshick, R., He, K., and Doll\u00e1r, P. (2017, January 21\u201326). Focal loss for dense object detection. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Honolulu, HI, USA.","DOI":"10.1109\/ICCV.2017.324"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Shen, Z., Liu, Z., Li, J., Jiang, Y.G., Chen, Y., and Xue, X. (2017, January 21\u201326). Dsod: Learning deeply supervised object detectors from scratch. 
Proceedings of the IEEE International Conference on Computer Vision (ICCV), Honolulu, HI, USA.","DOI":"10.1109\/ICCV.2017.212"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Zhang, S., Wen, L., Bian, X., Lei, Z., and Li, S.Z. (2018, January 18\u201323). Single-shot refinement neural network for object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2018, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00442"},{"key":"ref_22","unstructured":"Wang, R.J., Li, X., and Ling, C.X. (2018, January 18\u201323). Pelee: A real-time object detection system on mobile devices. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA."},{"key":"ref_23","unstructured":"Iandola, F., Moskewicz, M., Karayev, S., Girshick, R., Darrell, T., and Keutzer, K. (2014). Densenet: Implementing Efficient Convnet Descriptor Pyramids. arXiv."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Wong, A., Shafiee, M.J., Li, F., and Chwyl, B. (2018, January 8\u201310). Tiny SSD: A tiny single-shot detection deep convolutional neural network for real-time embedded object detection. Proceedings of the 2018 15th Conference on Computer and Robot Vision (CRV), Toronto, ON, Canada.","DOI":"10.1109\/CRV.2018.00023"},{"key":"ref_25","unstructured":"Iandola, F.N., Han, S., Moskewicz, M.W., Ashraf, K., Dally, W.J., and Keutzer, K. (2016). SqueezeNet: AlexNet-Level Accuracy with 50x Fewer Parameters and <0.5 MB Model Size. arXiv."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"4357","DOI":"10.1109\/TIP.2018.2835143","article-title":"Gabor Convolutional Networks","volume":"27","author":"Luan","year":"2018","journal-title":"IEEE Trans. Image Process."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Huang, R., Pedoeem, J., and Chen, C. (2018, January 10\u201313). YOLO-LITE: A real-time object detection algorithm optimized for non-GPU computers. 
Proceedings of the 2018 IEEE International Conference on Big Data (Big Data), Seattle, WA, USA.","DOI":"10.1109\/BigData.2018.8621865"},{"key":"ref_28","unstructured":"Li, Y., Li, J., Lin, W., and Li, J. (2018). Tiny-Dsod: Lightweight Object Detection for Resource-Restricted Usages. arXiv."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Law, H., and Deng, J. (2018, January 8\u201314). Cornernet: Detecting objects as paired keypoints. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01264-9_45"},{"key":"ref_30","unstructured":"Zhou, X., Wang, D., and Kr\u00e4henb\u00fchl, P. (2019). Objects as Points. arXiv."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Yang, Z., Liu, S., Hu, H., Wang, L., and Lin, S. (2019, January 27\u201328). Reppoints: Point set representation for object detection. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Korea.","DOI":"10.1109\/ICCV.2019.00975"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Mao, M., Tian, Y., Zhang, B., Ye, Q., Liu, W., and Doermann, D. (2021). iffDetector: Inference-aware Feature Filtering for Object Detection. IEEE TNNLS, IEEE.","DOI":"10.1109\/TNNLS.2021.3081864"},{"key":"ref_33","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"3652","DOI":"10.1109\/JSTARS.2017.2694890","article-title":"Toward fast and accurate vehicle detection in aerial images using coupled region-based convolutional neural networks","volume":"10","author":"Deng","year":"2017","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"1884","DOI":"10.1109\/LGRS.2019.2956513","article-title":"DAGN: A Real-Time UAV Remote Sensing Image Vehicle Detection Framework","volume":"17","author":"Ke","year":"2020","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"1556","DOI":"10.1109\/TIP.2020.3045636","article-title":"A Global-Local Self-Adaptive Network for Drone-View Object Detection","volume":"30","author":"Deng","year":"2021","journal-title":"IEEE Trans. Image Process."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Wang, T., Anwer, R.M., Cholakkal, H., Khan, F.S., Pang, Y., and Shao, L. (2019, January 27\u201328). Learning rich features at high-speed for single-shot object detection. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Korea.","DOI":"10.1109\/ICCV.2019.00206"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"e3","DOI":"10.23915\/distill.00003","article-title":"Deconvolution and Checkerboard Artifacts","volume":"1","author":"Odena","year":"2016","journal-title":"Distill"},{"key":"ref_40","unstructured":"Zhu, P., Wen, L., Bian, X., Ling, H., and Hu, Q. (2018). Vision Meets Drones: A Challenge. arXiv."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Du, D., Qi, Y., Yu, H., Yang, Y., Duan, K., Li, G., Zhang, W., Huang, Q., and Tian, Q. (2018, January 8\u201314). 
The Unmanned Aerial Vehicle Benchmark: Object Detection and Tracking. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01249-6_23"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Howard, A., Sandler, M., Chu, G., Chen, L.C., Chen, B., Tan, M., Wang, W., Zhu, Y., Pang, R., and Vasudevan, V. (2019, January 27\u201328). Searching for mobilenetv3. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Korea.","DOI":"10.1109\/ICCV.2019.00140"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/16\/5656\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T06:49:23Z","timestamp":1760165363000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/16\/5656"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,8,22]]},"references-count":42,"journal-issue":{"issue":"16","published-online":{"date-parts":[[2021,8]]}},"alternative-id":["s21165656"],"URL":"https:\/\/doi.org\/10.3390\/s21165656","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2021,8,22]]}}}