{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,19]],"date-time":"2026-06-19T22:42:33Z","timestamp":1781908953055,"version":"3.54.5"},"reference-count":51,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2024,3,2]],"date-time":"2024-03-02T00:00:00Z","timestamp":1709337600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62171342"],"award-info":[{"award-number":["62171342"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["SKLGIE2023-M-3-1"],"award-info":[{"award-number":["SKLGIE2023-M-3-1"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100011354","name":"State Key Laboratory of Geo-Information Engineering","doi-asserted-by":"publisher","award":["62171342"],"award-info":[{"award-number":["62171342"]}],"id":[{"id":"10.13039\/501100011354","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100011354","name":"State Key Laboratory of Geo-Information Engineering","doi-asserted-by":"publisher","award":["SKLGIE2023-M-3-1"],"award-info":[{"award-number":["SKLGIE2023-M-3-1"]}],"id":[{"id":"10.13039\/501100011354","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Due to the absence of communication and coordination with external spacecraft, non-cooperative spacecraft present challenges for the servicing spacecraft in acquiring information about their pose and location. 
The accurate segmentation of non-cooperative spacecraft components in images is a crucial step in autonomously sensing the pose of non-cooperative spacecraft. This paper presents a novel overlay accelerator of DeepLab Convolutional Neural Networks (CNNs) for spacecraft image segmentation on an FPGA. First, several software\u2013hardware co-design aspects are investigated: (1) A CNNs-domain COD instruction set (Control, Operation, Data Transfer) is presented based on a Load\u2013Store architecture to enable the implementation of accelerator overlays. (2) An RTL-based prototype accelerator is developed for the COD instruction set. The accelerator incorporates dedicated units for instruction decoding and dispatch, scheduling, memory management, and operation execution. (3) A compiler is designed that leverages tiling and operation fusion techniques to optimize the execution of CNNs, generating binary instructions for the optimized operations. Our accelerator is implemented on a Xilinx Virtex-7 XC7VX690T FPGA at 200 MHz. Experiments demonstrate that, with INT16 quantization, our accelerator achieves an accuracy (mIoU) of 77.84% when accelerating the DeepLabv3+ ResNet18 segmentation model on the spacecraft component images (SCIs) dataset, only a 0.2% degradation compared to the original full-precision model. The accelerator achieves a performance of 184.19 GOPS\/s and a computational efficiency (Runtime Throughput\/Theoretical Roof Throughput) of 88.72%. Compared to previous work, our accelerator improves performance by 1.5\u00d7 and computational efficiency by 43.93%, all while consuming similar hardware resources. 
Additionally, in terms of instruction encoding, our instructions reduce the size by 1.5\u00d7 to 49\u00d7 when compiling the same model compared to previous work.<\/jats:p>","DOI":"10.3390\/rs16050894","type":"journal-article","created":{"date-parts":[[2024,3,4]],"date-time":"2024-03-04T10:11:57Z","timestamp":1709547117000},"page":"894","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":13,"title":["An Overlay Accelerator of DeepLab CNN for Spacecraft Image Segmentation on FPGA"],"prefix":"10.3390","volume":"16","author":[{"ORCID":"https:\/\/orcid.org\/0009-0002-3104-0737","authenticated-orcid":false,"given":"Zibo","family":"Guo","sequence":"first","affiliation":[{"name":"School of Computer Science and Technology, Xidian University, Xi\u2019an 710071, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Kai","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Xidian University, Xi\u2019an 710071, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Wei","family":"Liu","sequence":"additional","affiliation":[{"name":"Smart Earth Key Laboratory, Beijing 100094, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Xiaoyao","family":"Sun","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Xidian University, Xi\u2019an 710071, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Chongyang","family":"Ding","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Xidian University, Xi\u2019an 710071, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Shangrong","family":"Li","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Xidian University, Xi\u2019an 710071, 
China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2024,3,2]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"3311","DOI":"10.1109\/TIE.2016.2530789","article-title":"A Review on Recent Development of Spacecraft Attitude Fault Tolerant Control System","volume":"63","author":"Yin","year":"2016","journal-title":"IEEE Trans. Ind. Electron."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Uriot, T., Izzo, D., Sim\u00f5es, L.F., Abay, R., Einecke, N., Rebhan, S., Martinez-Heras, J., Letizia, F., Siminski, J., and Merz, K. (2020). Spacecraft Collision Avoidance Challenge: Design and results of a machine learning competition. arXiv.","DOI":"10.1007\/s42064-021-0101-5"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"540","DOI":"10.1093\/mnras\/staa1463","article-title":"Machine learning classification of new asteroid families members","volume":"496","author":"Carruba","year":"2020","journal-title":"Mon. Not. R. Astron. Soc."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"448","DOI":"10.1016\/j.actaastro.2016.06.018","article-title":"RemoveDEBRIS: An in-orbit active debris removal demonstration mission","volume":"127","author":"Forshaw","year":"2016","journal-title":"Acta Astronaut."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Dung, H.A., Chen, B., and Chin, T.J. (2021, January 19\u201325). A Spacecraft Dataset for Detection, Segmentation and Parts Recognition. Proceedings of the Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Nashville, TN, USA.","DOI":"10.1109\/CVPRW53098.2021.00229"},{"key":"ref_6","unstructured":"Black, K., Shankar, S., Fonseka, D., Deutsch, J., Dhir, A., and Akella, M.R. (2021). Real-Time, Flight-Ready, Non-Cooperative Spacecraft Pose Estimation Using Monocular Imagery. 
arXiv."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"2","DOI":"10.1007\/s11263-007-0109-1","article-title":"Textonboost for image understanding: Multi-class object recognition and segmentation by jointly modeling texture, layout, and context","volume":"81","author":"Shotton","year":"2009","journal-title":"Int. J. Comput. Vis."},{"key":"ref_8","unstructured":"Ladick\u1ef3, L., Russell, C., Kohli, P., and Torr, P.H. (October, January 27). Associative hierarchical crfs for object class image segmentation. Proceedings of the International Conference on Computer Vision(ICCV), Kyoto, Japan."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 7\u201312). Fully convolutional networks for semantic segmentation. Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Liu, Y., Zhu, M., Wang, J., Guo, X., Yang, Y., and Wang, J. (2022). Multi-Scale Deep Neural Network Based on Dilated Convolution for Spacecraft Image Segmentation. Sensors, 22.","DOI":"10.3390\/s22114222"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., and Zitnick, C.L. (2014, January 6\u201312). Microsoft coco: Common objects in context. Proceedings of the Computer Vision\u2013ECCV 2014: 13th European Conference, Zurich, Switzerland. Proceedings, Part V 13.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Petrick, D., Geist, A., Albaijes, D., Davis, M., Sparacino, P., Crum, G., Ripley, R., Boblitt, J., and Flatley, T. (2014, January 1\u20138). SpaceCube v2.0 space flight hybrid reconfigurable data processing system. 
Proceedings of the IEEE Aerospace Conference, Big Sky, MT, USA.","DOI":"10.1109\/AERO.2014.6836226"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Shen, J., Wang, D., Huang, Y., Wen, M., and Zhang, C. (2019, January 2\u20136). Scale-out Acceleration for 3D CNN-based Lung Nodule Segmentation on a Multi-FPGA System. Proceedings of the Design Automation Conference (DAC), Las Vegas, NV, USA.","DOI":"10.1145\/3316781.3317906"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5\u20139). U-net: Convolutional networks for biomedical image segmentation. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), Munich, Germany.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"704","DOI":"10.1109\/TCSI.2020.3038139","article-title":"Roadnet-rt: High throughput cnn architecture and soc design for real-time road segmentation","volume":"68","author":"Bai","year":"2020","journal-title":"IEEE Trans. Circuits Syst. I Regul. Pap."},{"key":"ref_16","first-page":"1","article-title":"Optimizing CNN-Based Segmentation with Deeply Customized Convolutional and Deconvolutional Architectures on FPGA","volume":"11","author":"Liu","year":"2018","journal-title":"ACM Trans. Reconfig. Technol. Syst."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Liu, S., and Luk, W. (2019, January 8\u201312). Towards an Efficient Accelerator for DNN-Based Remote Sensing Image Segmentation on FPGAs. 
Proceedings of the International Conference on Field Programmable Logic and Applications (FPL), Barcelona, Spain.","DOI":"10.1109\/FPL.2019.00037"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1185","DOI":"10.1109\/TCSI.2021.3131581","article-title":"A Flexible and Efficient FPGA Accelerator for Various Large-Scale and Lightweight CNNs","volume":"69","author":"Wu","year":"2022","journal-title":"IEEE Trans. Circuits Syst. I Regul. Pap."},{"key":"ref_19","unstructured":"Adam, P., Abhishek, C., Sangpil, K., and Eugenio, C. (2016). ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation. arXiv."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Zhao, H., Shi, J., Qi, X., Wang, X., and Jia, J. (2017, January 21\u201326). Pyramid scene parsing network. Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.660"},{"key":"ref_21","unstructured":"Chen, L.C., Papandreou, G., Schroff, F., and Adam, H. (2017). Rethinking atrous convolution for semantic image segmentation. arXiv."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Chen, L.C., Zhu, Y., Papandreou, G., Schroff, F., and Adam, H. (2018, January 8\u201314). Encoder-decoder with atrous separable convolution for semantic image segmentation. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_49"},{"key":"ref_23","unstructured":"(2023, October 05). SCIs Segmentation Dataset. Available online: https:\/\/github.com\/ZiBoGuo\/SCIs-Dataset."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Mor\u00ec, P., Vemparala, M.R., Fasfous, N., Mitra, S., Sarkar, S., Frickenstein, A., Frickenstein, L., Helms, D., Nagaraja, N.S., and Stechele, W. (2022, January 10\u201314). Accelerating and pruning CNNs for semantic segmentation on FPGA. 
Proceedings of the 59th ACM\/IEEE Design Automation Conference, San Francisco, CA, USA.","DOI":"10.1145\/3489517.3530424"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"3471","DOI":"10.1109\/TCSI.2020.2991189","article-title":"DT-CNN: An energy-efficient dilated and transposed convolutional neural network processor for region of interest based image segmentation","volume":"67","author":"Im","year":"2020","journal-title":"IEEE Trans. Circuits Syst. I Regul. Pap."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s40537-021-00444-8","article-title":"Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions","volume":"8","author":"Alzubaidi","year":"2021","journal-title":"J. Big Data"},{"key":"ref_27","unstructured":"Ioffe, S., and Szegedy, C. (2015, January 6\u201311). Batch normalization: Accelerating deep network training by reducing internal covariate shift. Proceedings of the International Conference on Machine Learning, Lille, France."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1861","DOI":"10.1109\/TVLSI.2019.2905242","article-title":"A High-Throughput and Power-Efficient FPGA Implementation of YOLO CNN for Object Detection","volume":"27","author":"Nguyen","year":"2019","journal-title":"IEEE Trans. Very Large Scale Integr. (VLSI) Syst."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"65","DOI":"10.1145\/1498765.1498785","article-title":"Roofline: An insightful visual performance model for multicore architectures","volume":"52","author":"Williams","year":"2009","journal-title":"Commun. ACM"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Liu, S., Du, Z., Tao, J., Han, D., Luo, T., Xie, Y., Chen, Y., and Chen, T. (2016, January 18\u201322). Cambricon: An Instruction Set Architecture for Neural Networks. 
Proceedings of the Annual International Symposium on Computer Architecture (ISCA), Seoul, Republic of Korea.","DOI":"10.1109\/ISCA.2016.42"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"35","DOI":"10.1109\/TVLSI.2019.2939726","article-title":"OPU: An FPGA-Based Overlay Processor for Convolutional Neural Networks","volume":"28","author":"Yu","year":"2020","journal-title":"IEEE Trans. Very Large Scale Integr. (VLSI) Syst."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3283452","article-title":"Instruction driven cross-layer cnn accelerator for fast detection on fpga","volume":"11","author":"Yu","year":"2018","journal-title":"ACM Trans. Reconfig. Technol. Syst. (TRETS)"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"2668","DOI":"10.1109\/TCAD.2019.2930577","article-title":"Dnnvm: End-to-end compiler leveraging heterogeneous optimizations on fpga-based cnn accelerators","volume":"39","author":"Xing","year":"2019","journal-title":"IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst."},{"key":"ref_34","unstructured":"(2024, January 02). Vitis AI Library User Guide (UG1354). Available online: https:\/\/docs.xilinx.com\/r\/1.4.1-English\/ug1354-xilinx-ai-sdk\/ZCU102-Evaluation-Kit."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Cong, J., Wei, P., Yu, C.H., and Zhang, P. (2018, January 24\u201328). Automated accelerator generation and optimization with composable, parallel and pipeline architecture. Proceedings of the ACM\/ESDA\/IEEE Design Automation Conference (DAC), IEEE, San Francisco, CA, USA.","DOI":"10.1109\/DAC.2018.8465940"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Qiu, J., Wang, J., Yao, S., Guo, K., Li, B., Zhou, E., Yu, J., Tang, T., Xu, N., and Song, S. (2016, January 21\u201324). Going Deeper with Embedded FPGA Platform for Convolutional Neural Network. 
Proceedings of the ACM\/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), Monterey, CA, USA.","DOI":"10.1145\/2847263.2847265"},{"key":"ref_37","unstructured":"Wu, D., Tang, Q., Zhao, Y., Zhang, M., Fu, Y., and Zhang, D. (2020). EasyQuant: Post-training Quantization via Scale Optimization. arXiv."},{"key":"ref_38","unstructured":"Liu, W., Rabinovich, A., and Berg, A.C. (2015). Parsenet: Looking wider to see better. arXiv."},{"key":"ref_39","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_41","unstructured":"Iandola, F.N., Han, S., Moskewicz, M.W., Ashraf, K., Dally, W.J., and Keutzer, K. (2016). SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and< 0.5 MB model size. arXiv."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"2032924","DOI":"10.1080\/08839514.2022.2032924","article-title":"A survey on deep learning-based architectures for semantic segmentation on 2d images","volume":"36","author":"Ulku","year":"2022","journal-title":"Appl. Artif. Intell."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Hu, Y., Liang, S., Yu, J., Wang, Y., and Yang, H. (2019, January 15\u201317). On-chip instruction generation for cross-layer CNN accelerator on FPGA. Proceedings of the 2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), Miami, FL, USA.","DOI":"10.1109\/ISVLSI.2019.00011"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Friedrich, S., Sampath, S.B., Wittig, R., Vemparala, M.R., Fasfous, N., Mat\u00fa\u0161, E., Stechele, W., and Fettweis, G. (2023, January 5\u20137). 
Lightweight instruction set for flexible dilated convolutions and mixed-precision operands. Proceedings of the 2023 24th International Symposium on Quality Electronic Design (ISQED), San Francisco, CA, USA.","DOI":"10.1109\/ISQED57927.2023.10129341"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Chollet, F. (2017, January 21\u201326). Xception: Deep learning with depthwise separable convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.195"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"326","DOI":"10.1109\/TNNLS.2018.2844093","article-title":"fpgaConvNet: Mapping regular and irregular convolutional neural networks on FPGAs","volume":"30","author":"Venieris","year":"2018","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"35","DOI":"10.1109\/TCAD.2017.2705069","article-title":"Angel-Eye: A Complete Design Flow for Mapping CNN Onto Embedded FPGA","volume":"37","author":"Guo","year":"2018","journal-title":"IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"2072","DOI":"10.1109\/TCAD.2017.2785257","article-title":"Caffeine: Toward Uniformed Representation and Acceleration for Deep Convolutional Neural Networks","volume":"38","author":"Zhang","year":"2019","journal-title":"IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3570928","article-title":"FlexCNN: An End-to-End Framework for Composing CNN Accelerators on FPGA","volume":"16","author":"Basalama","year":"2023","journal-title":"ACM Trans. Reconfig. Technol. Syst."},{"key":"ref_50","unstructured":"(2024, January 02). Zynq DPU Product Guide (PG338). 
Available online: https:\/\/docs.xilinx.com\/r\/3.2-English\/pg338-dpu\/Advanced-Tab."},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3460288","article-title":"FTT-NAS: Discovering fault-tolerant convolutional neural architecture","volume":"26","author":"Ning","year":"2021","journal-title":"ACM Trans. Des. Autom. Electron. Syst. TODAES"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/16\/5\/894\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T14:08:30Z","timestamp":1760105310000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/16\/5\/894"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,3,2]]},"references-count":51,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2024,3]]}},"alternative-id":["rs16050894"],"URL":"https:\/\/doi.org\/10.3390\/rs16050894","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,3,2]]}}}