{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,10]],"date-time":"2026-04-10T18:26:28Z","timestamp":1775845588222,"version":"3.50.1"},"reference-count":57,"publisher":"MDPI AG","issue":"21","license":[{"start":{"date-parts":[[2022,10,25]],"date-time":"2022-10-25T00:00:00Z","timestamp":1666656000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China (NSFC)","doi-asserted-by":"publisher","award":["62171040"],"award-info":[{"award-number":["62171040"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China (NSFC)","doi-asserted-by":"publisher","award":["2021TQ0177"],"award-info":[{"award-number":["2021TQ0177"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002858","name":"China Postdoctoral Science Foundation","doi-asserted-by":"publisher","award":["62171040"],"award-info":[{"award-number":["62171040"]}],"id":[{"id":"10.13039\/501100002858","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002858","name":"China Postdoctoral Science Foundation","doi-asserted-by":"publisher","award":["2021TQ0177"],"award-info":[{"award-number":["2021TQ0177"]}],"id":[{"id":"10.13039\/501100002858","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>In recent years, object detectors based on convolutional neural networks have been widely used on remote sensing images. 
However, the improvement of their detection performance depends on a deeper convolution layer and a complex convolution structure, resulting in a significant increase in the storage space and computational complexity. Although previous works have designed a variety of new lightweight convolution and compression algorithms, these works often require complex manual design and cause the detector to be greatly modified, which makes it difficult to directly apply the algorithms to different detectors and general hardware. Therefore, this paper proposes an iterative pruning framework based on assistant distillation. Specifically, a structured sparse pruning strategy for detectors is proposed. By taking the channel scaling factor as a representation of the weight importance, the channels of the network are pruned and the detector is greatly slimmed. Then, a teacher assistant distillation model is proposed to recover the network performance after compression. The intermediate models retained in the pruning process are used as assistant models. By way of the teachers distilling the assistants and the assistants distilling the students, the students\u2019 underfitting caused by the difference in capacity between teachers and students is eliminated, thus effectively restoring the network performance. By using this compression framework, we can greatly compress the network without changing the network structure and can obtain the support of any hardware platform and deep learning library. 
Extensive experiments show that compared with existing detection networks, our method can achieve an effective balance between speed and accuracy on three commonly used remote sensing target datasets (i.e., NWPU VHR-10, RSOD, and DOTA).<\/jats:p>","DOI":"10.3390\/rs14215347","type":"journal-article","created":{"date-parts":[[2022,10,26]],"date-time":"2022-10-26T07:17:48Z","timestamp":1666768668000},"page":"5347","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":13,"title":["Sparse Channel Pruning and Assistant Distillation for Faster Aerial Object Detection"],"prefix":"10.3390","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3747-5128","authenticated-orcid":false,"given":"Chenwei","family":"Deng","sequence":"first","affiliation":[{"name":"School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3021-5371","authenticated-orcid":false,"given":"Donglin","family":"Jing","sequence":"additional","affiliation":[{"name":"School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhihan","family":"Ding","sequence":"additional","affiliation":[{"name":"School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7905-0163","authenticated-orcid":false,"given":"Yuqi","family":"Han","sequence":"additional","affiliation":[{"name":"Beijing National Research Center for Information Science and Technology, Institute for Artificial Intelligence, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2022,10,25]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"617","DOI":"10.1109\/LGRS.2013.2272492","article-title":"A new method on inshore ship detection in high-resolution satellite images using shape and context information","volume":"11","author":"Liu","year":"2013","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"602","DOI":"10.1109\/LGRS.2017.2664118","article-title":"Ship Detection From Optical Satellite Images Based on Saliency Segmentation and Structure-LBP Feature","volume":"14","author":"Yang","year":"2017","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1923","DOI":"10.1109\/TIP.2018.2878958","article-title":"An Augmented Linear Mixing Model to Address Spectral Variability for Hyperspectral Unmixing","volume":"28","author":"Hong","year":"2019","journal-title":"IEEE Trans. Image Process."},{"key":"ref_4","first-page":"5518615","article-title":"SpectralFormer: Rethinking Hyperspectral Image Classification with Transformers","volume":"60","author":"Hong","year":"2021","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Zhao, B., Zhao, B., Tang, L., Han, Y., and Wang, W. (2018). Deep Spatial-Temporal Joint Feature Representation for Video Object Detection. Sensors, 18.","DOI":"10.3390\/s18030774"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Tang, L., Tang, W., Qu, X., Han, Y., Wang, W., and Zhao, B. (2022). A scale-aware pyramid network for multi-scale object detection in sar images. Remote Sens., 14.","DOI":"10.3390\/rs14040973"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., and Berg, A.C. (2016, January 11\u201314). 
SSD: Single shot multibox detector. Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_8","unstructured":"Yang, X., Yan, J., Feng, Z., and He, T. (2019). R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object. arXiv."},{"key":"ref_9","unstructured":"Han, J., Ding, J., Li, J., and Xia, G.S. (2020). Align Deep Features for Oriented Object Detection. arXiv."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Han, J., Ding, J., Xue, N., and Xia, G.S. (2021). ReDet: A Rotation-equivariant Detector for Aerial Object Detection. arXiv.","DOI":"10.1109\/CVPR46437.2021.00281"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You only look once: Unified, real-time object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Cai, Z., and Vasconcelos, N. (2018, January 18\u201323). Cascade R-CNN: Delving Into High Quality Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00644"},{"key":"ref_13","unstructured":"Dai, J., Li, Y., He, K., and Sun, J. (2016). R-FCN: Object Detection via Region-based Fully Convolutional Networks. Adv. Neural Inf. Process. Syst., 29."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Jiang, Y., Zhu, X., Wang, X., Yang, S., Li, W., Wang, H., Fu, P., and Luo, Z. (2017). R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection. 
arXiv.","DOI":"10.1109\/ICPR.2018.8545598"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"3111","DOI":"10.1109\/TMM.2018.2818020","article-title":"Arbitrary-oriented scene text detection via rotation proposals","volume":"20","author":"Ma","year":"2018","journal-title":"IEEE Trans. Multimed."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Ren, S., He, K., Girshick, R., and Sun, J. (2017). Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell., 39.","DOI":"10.1109\/TPAMI.2016.2577031"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Huang, G., Liu, Z., Van Der Maaten, L., and Weinberger, K.Q. (2017, January 21\u201326). Densely connected convolutional networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.243"},{"key":"ref_19","first-page":"1","article-title":"Far-net: Fast anchor refining for arbitrary-oriented object detection","volume":"19","author":"Deng","year":"2022","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Wang, W., Han, Y., Deng, C., and Li, Z. (2022). Hyperspectral image classification via deep structure dictionary learning. Remote Sens., 14.","DOI":"10.3390\/rs14092266"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Zhang, X., Zhou, X., Lin, M., and Sun, J. (2018, January 18\u201323). Shufflenet: An extremely efficient convolutional neural network for mobile devices. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00716"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Ma, N., Zhang, X., Zheng, H.T., and Sun, J. (2018, January 8\u201314). Shufflenet v2: Practical guidelines for efficient cnn architecture design. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01264-9_8"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Zoph, B., Vasudevan, V., Shlens, J., and Le, Q.V. (2018, January 18\u201323). Learning transferable architectures for scalable image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00907"},{"key":"ref_24","unstructured":"Tan, M., and Le, Q. (2019, January 9\u201315). Efficientnet: Rethinking model scaling for convolutional neural networks. Proceedings of the International Conference on Machine Learning, PMLR, Long Beach, CA, USA."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., and Chen, L.C. (2018, January 18\u201323). Mobilenetv2: Inverted residuals and linear bottlenecks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00474"},{"key":"ref_26","unstructured":"Howard, A., Sandler, M., Chu, G., Chen, L.C., Chen, B., Tan, M., Wang, W., Zhu, Y., Pang, R., and Vasudevan, V. (November, January 27). Searching for mobilenetv3. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Korea."},{"key":"ref_27","unstructured":"Wang, R.J., Li, X., and Ling, C.X. (2018). Pelee: A real-time object detection system on mobile devices. Adv. Neural Inf. Process. Syst., 31."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"He, Y., Lin, J., Liu, Z., Wang, H., Li, L.J., and Han, S. 
(2018, January 8\u201314). Amc: Automl for model compression and acceleration on mobile devices. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_48"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"He, Y., Kang, G., Dong, X., Fu, Y., and Yang, Y. (2018). Soft filter pruning for accelerating deep convolutional neural networks. arXiv.","DOI":"10.24963\/ijcai.2018\/309"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Jacob, B., Kligys, S., Chen, B., Zhu, M., Tang, M., Howard, A., Adam, H., and Kalenichenko, D. (2018, January 18\u201323). Quantization and training of neural networks for efficient integer-arithmetic-only inference. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00286"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"6414","DOI":"10.1109\/TCSVT.2022.3166803","article-title":"Rb-net: Training highly accurate and efficient binary neural networks with reshaped point-wise convolution and balanced activation","volume":"32","author":"Liu","year":"2022","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Liu, C., Ding, W., Xia, X., Hu, Y., Zhang, B., and Liu, J. (2019). Rbcn: Rectified binary convolutional networks for enhancing the performance of 1-bit dcnns. arXiv.","DOI":"10.1109\/CVPR.2019.00280"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Liu, C., Ding, W., Xia, X., Zhang, B., Gu, J., Liu, J., Ji, R., and Doermann, D. (2019, January 15\u201320). Circulant binary convolutional networks: Enhancing the performance of 1-bit dcnns with circulant back propagation. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00280"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Veit, A., and Belongie, S. (2018, January 8\u201314). Convolutional networks with adaptive inference graphs. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01246-5_1"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Wang, X., Yu, F., Dou, Z.Y., Darrell, T., and Gonzalez, J.E. (2018, January 8\u201314). Skipnet: Learning dynamic routing in convolutional networks. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01261-8_25"},{"key":"ref_36","unstructured":"Han, S., Mao, H., and Dally, W.J. (2015). Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. arXiv."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Cho, J.H., and Hariharan, B. (November, January 27). On the Efficacy of Knowledge Distillation. Proceedings of the 2019 IEEE\/CVF International Conference on Computer Vision, ICCV 2019, Seoul, Korea.","DOI":"10.1109\/ICCV.2019.00489"},{"key":"ref_38","unstructured":"Yang, C., Xie, L., Qiao, S., and Yuille, A.L. (February, January 27). Training deep neural networks in generations: A more tolerant teacher educates better students. Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA."},{"key":"ref_39","unstructured":"Molchanov, P., Tyree, S., Karras, T., Aila, T., and Kautz, J. (2016). Pruning convolutional neural networks for resource efficient inference. arXiv."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Luo, J.H., Wu, J., and Lin, W. (2017, January 22\u201329). Thinet: A filter level pruning method for deep neural network compression. 
Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.541"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"He, Y., Zhang, X., and Sun, J. (2017, January 22\u201329). Channel pruning for accelerating very deep neural networks. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.155"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Yu, R., Li, A., Chen, C.F., Lai, J.H., Morariu, V.I., Han, X., Gao, M., Lin, C.Y., and Davis, L.S. (2018, January 18\u201323). Nisp: Pruning networks using neuron importance score propagation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00958"},{"key":"ref_43","unstructured":"Zhuang, Z., Tan, M., Zhuang, B., Liu, J., Guo, Y., Wu, Q., Huang, J., and Zhu, J. (2018). Discrimination-aware channel pruning for deep neural networks. Adv. Neural Inf. Process. Syst., 31."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Bucilu\u01ce, C., Caruana, R., and Niculescu-Mizil, A. (2006, January 20\u201323). Model compression. Proceedings of the 12th ACM SIGKDD International Conference on KNOWLEDGE Discovery and Data Mining, Philadelphia, PA, USA.","DOI":"10.1145\/1150402.1150464"},{"key":"ref_45","unstructured":"Hinton, G., Vinyals, O., and Dean, J. (2015). Distilling the knowledge in a neural network. arXiv."},{"key":"ref_46","unstructured":"Romero, A., Ballas, N., Kahou, S.E., Chassang, A., Gatta, C., and Bengio, Y. (2014). Fitnets: Hints for thin deep nets. arXiv."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Yim, J., Joo, D., Bae, J., and Kim, J. (2017, January 21\u201326). A gift from knowledge distillation: Fast optimization, network minimization and transfer learning. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.754"},{"key":"ref_48","unstructured":"Czarnecki, W.M., Osindero, S., Jaderberg, M., Swirszcz, G., and Pascanu, R. (2017). Sobolev training for neural networks. Adv. Neural Inf. Process. Syst., 30."},{"key":"ref_49","unstructured":"Tarvainen, A., and Valpola, H. (2017). Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results. Adv. Neural Inf. Process. Syst., 30."},{"key":"ref_50","unstructured":"Urban, G., Geras, K.J., Kahou, S.E., Aslan, O., Wang, S., Caruana, R., Mohamed, A., Philipose, M., and Richardson, M. (2016). Do deep convolutional nets really need to be deep and convolutional?. arXiv."},{"key":"ref_51","unstructured":"Sau, B.B., and Balasubramanian, V.N. (2016). Deep model compression: Distilling knowledge from noisy teachers. arXiv."},{"key":"ref_52","first-page":"783","article-title":"Kdgan: Knowledge distillation with generative adversarial networks","volume":"31","author":"Wang","year":"2018","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"You, S., Xu, C., Xu, C., and Tao, D. (2017, January 13\u201317). Learning from multiple teacher networks. Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada.","DOI":"10.1145\/3097983.3098135"},{"key":"ref_54","unstructured":"Redmon, J., and Farhadi, A. (2018). Yolov3: An incremental improvement. arXiv."},{"key":"ref_55","unstructured":"Bochkovskiy, A., Wang, C.Y., and Liao, H.Y.M. (2020). Yolov4: Optimal speed and accuracy of object detection. arXiv."},{"key":"ref_56","unstructured":"Li, H., Kadav, A., Durdanovic, I., Samet, H., and Graf, H.P. (2017, January 24\u201326). Pruning Filters for Efficient ConvNets. 
Proceedings of the 5th International Conference on Learning Representations, ICLR 2017, Toulon, France."},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Duan, K., Bai, S., Xie, L., Qi, H., Huang, Q., and Tian, Q. (November, January 27). CenterNet: Keypoint Triplets for Object Detection. Proceedings of the 2019 IEEE\/CVF International Conference on Computer Vision, ICCV 2019, Seoul, Korea.","DOI":"10.1109\/ICCV.2019.00667"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/21\/5347\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T01:02:42Z","timestamp":1760144562000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/21\/5347"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,10,25]]},"references-count":57,"journal-issue":{"issue":"21","published-online":{"date-parts":[[2022,11]]}},"alternative-id":["rs14215347"],"URL":"https:\/\/doi.org\/10.3390\/rs14215347","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,10,25]]}}}