{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T03:01:28Z","timestamp":1760238088173,"version":"build-2065373602"},"reference-count":34,"publisher":"MDPI AG","issue":"14","license":[{"start":{"date-parts":[[2020,7,17]],"date-time":"2020-07-17T00:00:00Z","timestamp":1594944000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Visual semantic segmentation, which is represented by the semantic segmentation network, has been widely used in many fields, such as intelligent robots, security, and autonomous driving. However, these Convolutional Neural Network (CNN)-based networks have high requirements for computing resources and programmability for hardware platforms. For embedded platforms and terminal devices in particular, Graphics Processing Unit (GPU)-based computing platforms cannot meet these requirements in terms of size and power consumption. In contrast, the Field Programmable Gate Array (FPGA)-based hardware system not only has flexible programmability and high embeddability, but can also meet lower power consumption requirements, which make it an appropriate solution for semantic segmentation on terminal devices. In this paper, we demonstrate EDSSA\u2014an Encoder-Decoder semantic segmentation networks accelerator architecture which can be implemented with flexible parameter configurations and hardware resources on the FPGA platforms that support Open Computing Language (OpenCL) development. We introduce the related technologies, architecture design, algorithm optimization, and hardware implementation of the Encoder-Decoder semantic segmentation network SegNet as an example, and undertake a performance evaluation. 
Using an Intel Arria-10 GX1150 platform for evaluation, our work achieves a throughput higher than 432.8 GOP\/s with power consumption of about 20 W, which is a 1.2\u00d7 improvement in the energy-efficiency ratio compared to a high-performance GPU.<\/jats:p>","DOI":"10.3390\/s20143969","type":"journal-article","created":{"date-parts":[[2020,7,17]],"date-time":"2020-07-17T10:22:02Z","timestamp":1594981322000},"page":"3969","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["EDSSA: An Encoder-Decoder Semantic Segmentation Networks Accelerator on OpenCL-Based FPGA Platform"],"prefix":"10.3390","volume":"20","author":[{"given":"Hongzhi","family":"Huang","sequence":"first","affiliation":[{"name":"School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yakun","family":"Wu","sequence":"additional","affiliation":[{"name":"School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mengqi","family":"Yu","sequence":"additional","affiliation":[{"name":"Department of Electronic Engineering and BNRist, Tsinghua University, Beijing 100084, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3880-4501","authenticated-orcid":false,"given":"Xuesong","family":"Shi","sequence":"additional","affiliation":[{"name":"Intel Labs China, Beijing 100090, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5054-9590","authenticated-orcid":false,"given":"Fei","family":"Qiao","sequence":"additional","affiliation":[{"name":"Department of Electronic Engineering and BNRist, Tsinghua University, Beijing 100084, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Li","family":"Luo","sequence":"additional","affiliation":[{"name":"School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3189-7562","authenticated-orcid":false,"given":"Qi","family":"Wei","sequence":"additional","affiliation":[{"name":"Department of Precision Instrument, Tsinghua University, Beijing 100084, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xinjun","family":"Liu","sequence":"additional","affiliation":[{"name":"Department of Mechanical Engineering, Tsinghua University, Beijing 100084, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2020,7,17]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Miyamoto, R., Adachi, M., Nakamura, Y., Nakajima, T., Ishida, H., and Kobayashi, S. (2019, January 23\u201326). Accuracy Improvement of Semantic Segmentation Using Appropriate Datasets for Robot Navigation. Proceedings of the 2019 6th International Conference on Control, Decision and Information Technologies (CoDIT), Paris, France.","DOI":"10.1109\/CoDIT.2019.8820616"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Kim, W., and Seok, J. (2018, January 3\u20136). Indoor Semantic Segmentation for Robot Navigating on Mobile. Proceedings of the International Conference on Ubiquitous and Future Networks (ICUFN), Prague, Czech Republic.","DOI":"10.1109\/ICUFN.2018.8436956"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"345","DOI":"10.1080\/01691864.2014.1003096","article-title":"Understanding the intention of human activities through semantic perception: Observation, understanding and execution on a humanoid robot","volume":"29","author":"Beetz","year":"2015","journal-title":"Adv. 
Robot."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Ha, Q., Watanabe, K., Karasawa, T., Ushiku, Y., and Harada, T. (2017, January 24\u201328). MFNet: Towards real-time semantic segmentation for autonomous vehicles with multi-spectral scenes. Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada.","DOI":"10.1109\/IROS.2017.8206396"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Siam, M., Gamal, M., Abdel-Razek, M., Yogamani, S., Jagersand, M., and Zhang, H. (2018, January 18\u201322). A Comparative Study of Real-time Semantic Segmentation for Autonomous Driving. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPRW.2018.00101"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"64","DOI":"10.3389\/fnbot.2018.00064","article-title":"Faster R-CNN for Robust Pedestrian Detection Using Semantic Segmentation Network","volume":"12","author":"Liu","year":"2018","journal-title":"Front. Neurorobotics"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1096","DOI":"10.1016\/j.robot.2010.05.004","article-title":"Hybrid robot control and SLAM for persistent navigation and mapping","volume":"58","author":"Milford","year":"2010","journal-title":"Robot. Auton. Syst."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Zhang, F., Li, S.Q., Yuan, S., Sun, E.Z., and Zhao, L.G. (2017, January 10\u201312). Algorithms Analysis of Mobile Robot SLAM based on Kalman and Particle Filter. 
Proceedings of the 9th International Conference on Modelling, Identification and Control (ICMIC), Kunming, China.","DOI":"10.1109\/ICMIC.2017.8321612"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1106","DOI":"10.1109\/JSSC.2018.2886342","article-title":"Navion: A 2-mW Fully Integrated Real-Time Visual-Inertial Odometry Accelerator for Autonomous Navigation of Nano Drones","volume":"54","author":"Suleiman","year":"2019","journal-title":"IEEE J. Solid State Circuits"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Liu, R.Z., Yang, J.L., Chen, Y.R., and Zhao, W.S. (2019, January 2\u20136). eSLAM: An Energy-Efficient Accelerator for Real-Time ORB-SLAM on FPGA Platform. Proceedings of the 56th ACM\/EDAC\/IEEE Design Automation Conference (DAC), Las Vegas, NV, USA.","DOI":"10.1145\/3316781.3317820"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Zhang, Z., Liu, S.S., Tsai, G., Hu, H.B., Chu, C.C., and Zheng, F. (2018, January 21\u201325). PIRVS: An Advanced Visual-Inertial SLAM System with Flexible Sensor Fusion and Hardware Co-Design. Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Brisbane, QLD, Australia.","DOI":"10.1109\/ICRA.2018.8460672"},{"key":"ref_12","first-page":"1097","article-title":"ImageNet Classification with Deep Convolutional Neural Networks","volume":"1","author":"Krizhevsky","year":"2012","journal-title":"Neural Inf. Process. Syst."},{"key":"ref_13","unstructured":"Simonyan, K., and Zisserman, A. (2015, January 7\u20139). Very Deep Convolutional Networks for Large-Scale Image Recognition. Proceedings of the 3rd International Conference on Learning Representations (ICLR), San Diego, CA, USA."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"He, K.M., Zhang, X.Y., Ren, S.Q., and Sun, J. (2016, January 27\u201330). Deep Residual Learning for Image Recognition. 
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 7\u201312). Fully Convolutional Networks for Semantic Segmentation. Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"2481","DOI":"10.1109\/TPAMI.2016.2644615","article-title":"SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation","volume":"39","author":"Badrinarayanan","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_17","first-page":"234","article-title":"U-Net: Convolutional Networks for Biomedical Image Segmentation","volume":"9351","author":"Ronneberger","year":"2015","journal-title":"Med. Image Comput. Comput. Assist. Interv."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Yu, C., Liu, Z.X., Liu, X.J., Xie, F.G., Yang, Y., Wei, Q., and Qiao, F. (2018, January 1\u20135). DS-SLAM: A Semantic Visual SLAM towards Dynamic Environments. Proceedings of the 25th IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain.","DOI":"10.1109\/IROS.2018.8593691"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Brenot, F., Piat, J., and Fillatreau, P. (2016, January 12\u201315). FPGA based hardware acceleration of a BRIEF correlator module for a monocular SLAM application. Proceedings of the 10th International Conference on Distributed Smart Cameras (ICDSC), Paris, France.","DOI":"10.1145\/2967413.2967426"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Li, Z.Y., Chen, Y., Gong, L.Y., Liu, L., Sylvester, D., Blaauw, D., and Kim, H.S. (2019, January 17\u201321). 
An 879GOPS 243mW 80fps VGA Fully Visual CNN-SLAM Processor for Wide-Range Autonomous Exploration. Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA.","DOI":"10.1109\/ISSCC.2019.8662397"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Zhang, C., Li, P., Sun, G.Y., Guan, Y.J., Xiao, B.J., and Cong, J. (2015, January 22\u201324). Optimizing FPGA-based accelerator design for deep convolutional neural networks. Proceedings of the 2015 ACM\/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), Monterey, CA, USA.","DOI":"10.1145\/2684746.2689060"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3131289","article-title":"PLACID: A Platform for FPGA-Based Accelerator Creation for DCNNs","volume":"13","author":"Motamedi","year":"2017","journal-title":"ACM Trans. Multimed. Comput. Commun. Appl."},{"key":"ref_23","unstructured":"Li, H.M., Fan, X.T., Jiao, L., Cao, W., Zhou, X.G., and Wang, L.L. (September, January 29). A High Performance FPGA-based Accelerator for Large-Scale Convolutional Neural Networks. Proceedings of the 26th International Conference on Field-Programmable Logic and Applications (FPL), Lausanne, Switzerland."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Zhang, J.L., and Li, J. (2017, January 22\u201324). Improving the Performance of OpenCL-based FPGA Accelerator for Convolutional Neural Network. Proceedings of the ACM\/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), Monterey, CA, USA.","DOI":"10.1145\/3020078.3021698"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Aydonat, U., O\u2019Connell, S., Capalija, D., Ling, A.C., and Chiu, G.R. (2017, January 22\u201324). An OpenCL(TM) Deep Learning Accelerator on Arria 10. 
Proceedings of the ACM\/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), Monterey, CA, USA.","DOI":"10.1145\/3020078.3021738"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Wang, D., Xu, K., and Jiang, D.K. (2017, January 11\u201313). PipeCNN: An OpenCL-based open-source FPGA accelerator for convolution neural networks. Proceedings of the 2017 International Conference on Field Programmable Technology (ICFPT), Melbourne, VIC, Australia.","DOI":"10.1109\/FPT.2017.8280160"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Qiu, J.T., Wang, J., Yao, S., Guo, K.Y., Li, B.X., Zhou, E.J., Yu, J.C., Tang, T.Q., Xu, N.Y., and Song, S. (2016, January 21\u201323). Going Deeper with Embedded FPGA Platform for Convolutional Neural Network. Proceedings of the ACM\/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), Monterey, CA, USA.","DOI":"10.1145\/2847263.2847265"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Zhang, C., and Prasanna, V. (2017, January 22\u201324). Frequency Domain Acceleration of Convolutional Neural Networks on CPU-FPGA Shared Memory System. Proceedings of the ACM\/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), Monterey, CA, USA.","DOI":"10.1145\/3020078.3021727"},{"key":"ref_29","unstructured":"Ioffe, S., and Szegedy, C. (2015, January 6\u201311). Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. Proceedings of the 32nd International Conference on Machine Learning (ICML), Lille, France."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Jia, Y.Q., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., and Darrell, T. (2014, January 3\u20137). Caffe: Convolutional Architecture for Fast Feature Embedding. 
Proceedings of the ACM Conference on Multimedia (MM), Orlando, FL, USA.","DOI":"10.1145\/2647868.2654889"},{"key":"ref_31","unstructured":"Yu, M.Q., Huang, H.Z., Liu, H., He, S.Y., Qiao, F., Luo, L., Xie, F.G., Liu, X.J., and Yang, H.Z. (August, January 29). Optimizing FPGA-based Convolutional Encoder-Decoder Architecture for Semantic Segmentation. Proceedings of the 9th IEEE Annual International Conference on CYBER Technology in Automation, Control, and Intelligent Systems (CYBER), Suzhou, China."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Shi, X.S., Cao, L., Wang, D.W., Liu, L., You, G.M., Liu, S., and Wang, C. (2018, January 1\u20135). HERO: Accelerating Autonomous Robotic Tasks with FPGA. Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain.","DOI":"10.1109\/IROS.2018.8593522"},{"key":"ref_33","unstructured":"Alexgkendall (2020, April 16). Segnet Model File: Segnet_Pascal.prototxt, Pascal VOC, SegNet Model Zoo. Available online: https:\/\/github.com\/alexgkendall\/SegNet-Tutorial\/blob\/master\/Example_Models\/segnet_model_zoo.md."},{"key":"ref_34","unstructured":"(2020, April 16). Intel. 
Available online: https:\/\/ark.intel.com\/content\/www\/cn\/zh\/ark\/products\/65732\/intel-xeon-processor-e3-1230-v2-8m-cache-3-30-ghz.html?wapkw=e3%201230%20v2&erpm_id=5831403."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/14\/3969\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T09:49:15Z","timestamp":1760176155000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/14\/3969"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,7,17]]},"references-count":34,"journal-issue":{"issue":"14","published-online":{"date-parts":[[2020,7]]}},"alternative-id":["s20143969"],"URL":"https:\/\/doi.org\/10.3390\/s20143969","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2020,7,17]]}}}