{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,14]],"date-time":"2026-03-14T06:19:45Z","timestamp":1773469185871,"version":"3.50.1"},"reference-count":53,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2022,1,17]],"date-time":"2022-01-17T00:00:00Z","timestamp":1642377600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Sichuan Science and Technology Program","award":["2021YFG0315"],"award-info":[{"award-number":["2021YFG0315"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>As one type of object detection, small object detection has been widely used in daily-life-related applications with many real-time requirements, such as autopilot and navigation. Although deep-learning-based object detection methods have achieved great success in recent years, they are not effective in small object detection and most of them cannot achieve real-time processing. Therefore, this paper proposes a single-stage small object detection network (SODNet) that integrates the specialized feature extraction and information fusion techniques. An adaptively spatial parallel convolution module (ASPConv) is proposed to alleviate the lack of spatial information for target objects and adaptively obtain the corresponding spatial information through multi-scale receptive fields, thereby improving the feature extraction ability. Additionally, a split-fusion sub-module (SF) is proposed to effectively reduce the time complexity of ASPConv. A fast multi-scale fusion module (FMF) is proposed to alleviate the insufficient fusion of both semantic and spatial information. 
FMF uses two fast upsampling operators to first unify the resolution of the multi-scale feature maps extracted by the network and then fuse them, thereby effectively improving the small object detection ability. Comparative experimental results prove that the proposed method considerably improves the accuracy of small object detection on multiple benchmark datasets and achieves a high real-time performance.<\/jats:p>","DOI":"10.3390\/rs14020420","type":"journal-article","created":{"date-parts":[[2022,1,17]],"date-time":"2022-01-17T20:49:21Z","timestamp":1642452561000},"page":"420","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":128,"title":["Small Object Detection Method Based on Adaptive Spatial Parallel Convolution and Fast Multi-Scale Fusion"],"prefix":"10.3390","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9562-3865","authenticated-orcid":false,"given":"Guanqiu","family":"Qi","sequence":"first","affiliation":[{"name":"Computer Information Systems Department, State University of New York at Buffalo State, Buffalo, NY 14222, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2660-8324","authenticated-orcid":false,"given":"Yuanchuan","family":"Zhang","sequence":"additional","affiliation":[{"name":"College of Automation, Chongqing University of Posts and Telecommunications, Chongqing 400065, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7871-624X","authenticated-orcid":false,"given":"Kunpeng","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Information Engineering, Southwest University of Science and Technology, Mianyang 621010, China"}]},{"given":"Neal","family":"Mazur","sequence":"additional","affiliation":[{"name":"Computer Information Systems Department, State University of New York at Buffalo State, Buffalo, NY 14222, USA"}]},{"given":"Yang","family":"Liu","sequence":"additional","affiliation":[{"name":"BOE Technology Group Co., Ltd., Chongqing 
400799, China"}]},{"given":"Devanshi","family":"Malaviya","sequence":"additional","affiliation":[{"name":"Computer Information Systems Department, State University of New York at Buffalo State, Buffalo, NY 14222, USA"}]}],"member":"1968","published-online":{"date-parts":[[2022,1,17]]},"reference":[{"key":"ref_1","unstructured":"Dai, J., Li, Y., He, K., and Sun, J. (2016). R-fcn: Object detection via region-based fully convolutional networks. arXiv."},{"key":"ref_2","unstructured":"Ren, S., He, K., Girshick, R., and Sun, J. (2015). Faster r-cnn: Towards real-time object detection with region proposal networks. arXiv."},{"key":"ref_3","unstructured":"Redmon, J., and Farhadi, A. (2018). Yolov3: An incremental improvement. arXiv."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., and Berg, A.C. (2016, January 11\u201314). Ssd: Single shot multibox detector. Proceedings of the European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Pang, J., Chen, K., Shi, J., Feng, H., Ouyang, W., and Lin, D. (2019, January 15\u201320). Libra r-cnn: Towards balanced learning for object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00091"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Tan, M., Pang, R., and Le, Q.V. (2020, January 14\u201319). Efficientdet: Scalable and efficient object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Virtual.","DOI":"10.1109\/CVPR42600.2020.01079"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Lu, X., Li, B., Yue, Y., Li, Q., and Yan, J. (2019, January 15\u201320). Grid r-cnn. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00754"},{"key":"ref_8","unstructured":"Wang, C.-Y., Bochkovskiy, A., and Liao, H.-Y.M. (2020). Scaled-yolov4: Scaling cross stage partial network. arXiv."},{"key":"ref_9","unstructured":"Jocher, G., Nishimura, K., and Mineeva, T. (2021, September 01). Yolov5. Available online: https:\/\/github.com\/ultralytics\/yolov5."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., and Zitnick, C.L. (2014, January 6\u201312). Microsoft coco: Common objects in context. Proceedings of the European Conference on Computer Vision (ECCV), Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1007\/s11263-009-0275-4","article-title":"The pascal visual object classes (voc) challenge","volume":"88","author":"Everingham","year":"2010","journal-title":"Int. J. Comput. Vis."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"80","DOI":"10.1049\/trit.2018.1045","article-title":"Convolutional neural network based detection and judgement of environmental obstacle in vehicle operation","volume":"4","author":"Qi","year":"2019","journal-title":"CAAI Trans. Intell. Technol."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Lin, T.-Y., Goyal, P., Girshick, R., He, K., and Doll\u00e1r, P. (2017, January 22\u201325). Focal loss for dense object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/ICCV.2017.324"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Yu, X., Gong, Y., Jiang, N., Ye, Q., and Han, Z. (2020, January 1\u20135). Scale match for tiny person detection. 
Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), Snowmass Village, CO, USA.","DOI":"10.1109\/WACV45572.2020.9093394"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Zhu, Z., Liang, D., Zhang, S., Huang, X., Li, B., and Hu, S. (2016, January 27\u201330). Traffic-sign detection and classification in the wild. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.232"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Du, D., Qi, Y., Yu, H., Yang, Y., Duan, K., Li, G., Zhang, W., Huang, Q., and Tian, Q. (2018, January 8\u201314). The unmanned aerial vehicle benchmark: Object detection and tracking. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01249-6_23"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"107997","DOI":"10.1016\/j.patcog.2021.107997","article-title":"Scale-balanced loss for object detection","volume":"117","author":"Shuang","year":"2021","journal-title":"Pattern Recognit."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"107149","DOI":"10.1016\/j.patcog.2019.107149","article-title":"Mdfn: Multi-scale deep feature learning network for object detection","volume":"100","author":"Ma","year":"2020","journal-title":"Pattern Recognit."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"107929","DOI":"10.1016\/j.patcog.2021.107929","article-title":"Stdnet-st: Spatio-temporal convnet for small object detection","volume":"116","author":"Bosquet","year":"2021","journal-title":"Pattern Recognit."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"107867","DOI":"10.1016\/j.patcog.2021.107867","article-title":"Spatial context-aware network for salient object detection","volume":"114","author":"Kong","year":"2021","journal-title":"Pattern Recognit."},{"key":"ref_21","unstructured":"Zou, Z., Shi, Z., Guo, Y., and Ye, J. (2019). 
Object detection in 20 years: A survey. arXiv."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Qi, G.-J. (2016, January 27\u201330). Hierarchically gated deep networks for semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.249"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Liu, S., and Huang, D. (2018, January 8\u201314). Receptive field block net for accurate and fast object detection. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01252-6_24"},{"key":"ref_24","unstructured":"Yu, F., and Koltun, V. (2015). Multi-scale context aggregation by dilated convolutions. arXiv."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Ioffe, S., Vanhoucke, V., and Alemi, A. (2017, January 4\u20139). Inception-v4, inception-resnet and the impact of residual connections on learning. Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, CA, USA.","DOI":"10.1609\/aaai.v31i1.11231"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Zhu, Z., Luo, Y., Wei, H., Li, Y., Qi, G., Mazur, N., Li, Y., and Li, P. (2021). Atmospheric Light Estimation Based Remote Sensing Image Dehazing. Remote Sens., 13.","DOI":"10.3390\/rs13132432"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Lin, T.-Y., Doll\u00e1r, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 22\u201325). Feature pyramid networks for object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Liu, S., Qi, L., Qin, H., Shi, J., and Jia, J. (2018, January 18\u201322). Path aggregation network for instance segmentation. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00913"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"8062","DOI":"10.1109\/JSEN.2020.2981719","article-title":"Image Dehazing by an Artificial Image Fusion Method Based on Adaptive Structure Decomposition","volume":"20","author":"Zheng","year":"2020","journal-title":"IEEE Sens. J."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Zhu, Z., Luo, Y., Qi, G., Meng, J., Li, Y., and Mazur, N. (2021). Remote Sensing Image Defogging Networks Based on Dual Self-Attention Boost Residual Octave Convolution. Remote Sens., 13.","DOI":"10.3390\/rs13163104"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Zhu, X., Hu, H., Lin, S., and Dai, J. (2019, January 15\u201320). Deformable convnets v2: More deformable, better results. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00953"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Li, X., Wang, W., Hu, X., and Yang, J. (2019, January 15\u201320). Selective kernel networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00060"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Li, J., Liang, X., Wei, Y., Xu, T., Feng, J., and Yan, S. (2017, January 22\u201325). Perceptual generative adversarial networks for small object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.211"},{"key":"ref_34","unstructured":"Noh, J., Bae, W., Lee, W., Seo, J., and Kim, G. (November, January 27). Better to follow, follow to be better: Towards precise supervision of feature super-resolution for small object detection. 
Proceedings of the IEEE International Conference on Computer Vision (ICCV), Seoul, Korea."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Shi, W., Caballero, J., Husz\u00e1r, F., Totz, J., Aitken, A.P., Bishop, R., Rueckert, D., and Wang, Z. (2016, January 27\u201330). Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.207"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Zhang, Z., Wang, Z., Lin, Z., and Qi, H. (2019, January 15\u201320). Image super-resolution by neural texture transfer. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00817"},{"key":"ref_37","first-page":"1","article-title":"A Novel Fast Single Image Dehazing Algorithm Based on Artificial Multiexposure Image Fusion","volume":"70","author":"Zhu","year":"2021","journal-title":"IEEE Trans. Instrum. Meas."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Islam, M.A., Rochan, M., Bruce, N.D., and Wang, Y. (2017, January 22\u201325). Gated feedback refinement network for dense image labeling. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.518"},{"key":"ref_39","unstructured":"Ioffe, S., and Szegedy, C. (2015, January 6\u201311). Batch normalization: Accelerating deep network training by reducing internal covariate shift. Proceedings of the 32nd International Conference on Machine Learning (ICML), Lille, France."},{"key":"ref_40","unstructured":"Howard, A., Sandler, M., Chu, G., Chen, L.-C., Chen, B., Tan, M., Wang, W., Zhu, Y., Pang, R., and Vasudevan, V. (November, January 27). Searching for mobilenetv3. 
Proceedings of the IEEE International Conference on Computer Vision (ICCV), Seoul, Korea."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_42","unstructured":"Luo, W., Li, Y., Urtasun, R., and Zemel, R. (2017). Understanding the effective receptive field in deep convolutional neural networks. arXiv."},{"key":"ref_43","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"241","DOI":"10.1016\/j.ins.2020.02.067","article-title":"Dc-spp-yolo: Dense connection and spatial pyramid pooling based yolo for object detection","volume":"522","author":"Huang","year":"2020","journal-title":"Inf. Sci."},{"key":"ref_45","unstructured":"Deng, C., Wang, M., Liu, L., and Liu, Y. (2020). Extended feature pyramid network for small object detection. arXiv."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Rezatofighi, H., Tsoi, N., Gwak, J., Sadeghian, A., Reid, I., and Savarese, S. (2019, January 15\u201320). Generalized intersection over union: A metric and a loss for bounding box regression. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00075"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Gong, Y., Yu, X., Ding, Y., Peng, X., Zhao, J., and Han, Z. (2021, January 5\u20139). Effective fusion factor in fpn for tiny object detection. Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), Virtual.","DOI":"10.1109\/WACV48630.2021.00120"},{"key":"ref_48","unstructured":"Tian, Z., Shen, C., Chen, H., and He, T. (November, January 27). 
Fcos: Fully convolutional one-stage object detection. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Seoul, Korea."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"1758","DOI":"10.1109\/TCSVT.2019.2905881","article-title":"Small object detection in unmanned aerial vehicle images using feature fusion and scaling-based single shot detector with spatial context analysis","volume":"30","author":"Liang","year":"2019","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Meng, Z., Fan, X., Chen, X., Chen, M., and Tong, Y. (2017, January 4\u20136). Detecting small signs from large images. Proceedings of the 18th IEEE International Conference on Information Reuse and Integration (IRI), San Diego, CA, USA.","DOI":"10.1109\/IRI.2017.57"},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Kong, T., Sun, F., Yao, A., Liu, H., Lu, M., and Chen, Y. (2017, January 22\u201325). Ron: Reverse connection with objectness prior networks for object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.557"},{"key":"ref_52","unstructured":"Yang, F., Fan, H., Chu, P., Blasch, E., and Ling, H. (November, January 27). Clustered object detection in aerial images. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Seoul, Korea."},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2015, January 13\u201316). Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. 
Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile.","DOI":"10.1109\/ICCV.2015.123"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/2\/420\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T22:02:36Z","timestamp":1760133756000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/2\/420"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,1,17]]},"references-count":53,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2022,1]]}},"alternative-id":["rs14020420"],"URL":"https:\/\/doi.org\/10.3390\/rs14020420","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,1,17]]}}}