{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T00:57:01Z","timestamp":1759971421329,"version":"build-2065373602"},"reference-count":44,"publisher":"Association for Computing Machinery (ACM)","issue":"6","funder":[{"name":"Science and Technology Planning Project of Guangzhou","award":["2023B01J0007"],"award-info":[{"award-number":["2023B01J0007"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation","doi-asserted-by":"crossref","award":["62301165"],"award-info":[{"award-number":["62301165"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Basic and Applied Basic Research Foundation of Guangdong","award":["2022A1515110774"],"award-info":[{"award-number":["2022A1515110774"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Embed. Comput. Syst."],"published-print":{"date-parts":[[2025,11,30]]},"abstract":"<jats:p>Currently, there are two mainstream acceleration methods; one is mixed precision and the other is sparsity. Few accelerators support both mixed precision and sparsity, and most enable precision configurations across layers rather than within a single layer. Furthermore, most of accelerators adopt the traditional two\u2019s complement (2C) data representation method, and we found that 2C brings many invalid \u201c1\u201d when representing signed data, which brings more resources overhead for mixed precision and many invalid operations for bit-level sparsity. Therefore, we propose a high-efficiency accelerator featuring a precision-scalable Sign-Magnitude Processing Element (SM-PE), which adopts a data representation method of SM and can flexibly support various precision calculations (2, 4, 8 bits) and bit-level sparsity. In addition, a dynamic quantization algorithm named DoReFaLike and a bit-level column sparsity (BLCS) technique are proposed to improve the efficiency of SM-PEs. Under the same accuracy constraint, the sparsity rate of the SM scheme is 3.5\u00d7 higher than that of the 2C format. The accelerator has been synthesized on a 55nm CMOS ASIC platform. When scaled to 28nm, experimental results show that the energy efficiency of the proposed accelerator reaches 15.50, 25.37, 101.54 TOPS\/W with 8-bit, 4-bit, and 2-bit input activations, respectively, and weights represented in sparse 8-bit precision, operating at 400 MHz. Compared to state-of-the-art accelerators, the proposed design achieves a performance improvement of 1.1\u00d7 to 3.9\u00d7.<\/jats:p>","DOI":"10.1145\/3767336","type":"journal-article","created":{"date-parts":[[2025,9,15]],"date-time":"2025-09-15T11:30:21Z","timestamp":1757935821000},"page":"1-26","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["A Precision-Scalable Accelerator with Sign-Magnitude Representation and Dual Adder Trees"],"prefix":"10.1145","volume":"24","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1237-4945","authenticated-orcid":false,"given":"Xianghong","family":"Hu","sequence":"first","affiliation":[{"name":"School of Microelectronics, Guangdong University of Technology","place":["Guangzhou, China"]},{"name":"Company of Chipeye Microelectronics Foshan Ltd","place":["Guangzhou, China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-7764-000X","authenticated-orcid":false,"given":"Chaoming","family":"Yang","sequence":"additional","affiliation":[{"name":"Guangdong University of Technology","place":["Guangzhou, China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9700-4272","authenticated-orcid":false,"given":"Xueming","family":"Li","sequence":"additional","affiliation":[{"name":"School of Microelectronics, Guangdong University of Technology","place":["Guangzhou, China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-1697-6556","authenticated-orcid":false,"given":"Rongfeng","family":"Li","sequence":"additional","affiliation":[{"name":"Guangdong University of Technology","place":["Guangzhou, China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-9106-5989","authenticated-orcid":false,"given":"Yuanmiao","family":"Lin","sequence":"additional","affiliation":[{"name":"Guangdong University of Technology","place":["Guangzhou, China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-9246-6116","authenticated-orcid":false,"given":"Shansen","family":"Fu","sequence":"additional","affiliation":[{"name":"Guangdong University of Technology","place":["Guangzhou, China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8034-0616","authenticated-orcid":false,"given":"Hongmin","family":"Huang","sequence":"additional","affiliation":[{"name":"Guangdong Polytechnic Normal University","place":["Guangzhou, China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2842-6439","authenticated-orcid":false,"given":"Shuting","family":"Cai","sequence":"additional","affiliation":[{"name":"Guangdong University of Technology","place":["Guangzhou, China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2421-7621","authenticated-orcid":false,"given":"Xiaoming","family":"Xiong","sequence":"additional","affiliation":[{"name":"School of Microelectronics, Guangdong University of Technology","place":["Guangzhou, China"]}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2025,10,8]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICVRIS.2019.00049"},{"key":"e_1_3_1_3_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCMC56507.2023.10083750"},{"volume-title":"Proceedings of the 2020 International SoC Design Conference (ISOCC). 328\u2013329","author":"Rana A.","key":"e_1_3_1_4_2","unstructured":"A. Rana and K. K. Kim. 2020. A lightweight DNN for ECG image classification. In Proceedings of the 2020 International SoC Design Conference (ISOCC). 328\u2013329."},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2020.3036783"},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2019.8682195"},{"volume-title":"Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 4385\u20134389","author":"Feng X.","key":"e_1_3_1_7_2","unstructured":"X. Feng, B. Richardson, S. Amman, and J. Glass. 2015. On using heterogeneous data for vehicle-based speech recognition: A DNN-based approach. In Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 4385\u20134389."},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1145\/3677321"},{"volume-title":"Proceedings of the 2023 VTS Asia Pacific Wireless Communications Symposium (APWCS). 1\u20135.","author":"Miki Y.","key":"e_1_3_1_9_2","unstructured":"Y. Miki, K. Kobayashi, and W. Chujo. 2023. A study on data signal detection and demodulation based on object detection DNN for image sensor-based visible light communication. In Proceedings of the 2023 VTS Asia Pacific Wireless Communications Symposium (APWCS). 1\u20135."},{"volume-title":"Proceedings of the 2021 IEEE\/ACM International Conference on Automation of Software Test (AST). 80\u201389","author":"Kim S.","key":"e_1_3_1_10_2","unstructured":"S. Kim and S. Yoo. 2021. Multimodal surprise adequacy analysis of inputs for natural language processing DNN models. In Proceedings of the 2021 IEEE\/ACM International Conference on Automation of Software Test (AST). 80\u201389."},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCCS55155.2022.9846198"},{"volume-title":"Proceedings of the 2023 Data Compression Conference (DCC). 356\u2013356","author":"Pankiv O.","key":"e_1_3_1_12_2","unstructured":"O. Pankiv and D. Puchala. 2023. Neural implementation of non-linear scalar quantization. In Proceedings of the 2023 Data Compression Conference (DCC). 356\u2013356."},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2021.3129357"},{"key":"e_1_3_1_14_2","volume-title":"Proceedings of the 61st ACM\/IEEE Design Automation Conference. 1\u20136.","author":"Ziyi Guan","year":"2024","unstructured":"Guan Ziyi, Hantao Huang, Yupeng Su, Hong Huang, Ngai Wong, and Hao Yu. 2024. Aptq: Attention-aware post-training mixed-precision quantization for large language models. In Proceedings of the 61st ACM\/IEEE Design Automation Conference. 1\u20136."},{"volume-title":"Proceedings of the 2022 IEEE International Symposium on Circuits and Systems (ISCAS). 2871\u20132875","author":"Guo L.","key":"e_1_3_1_15_2","unstructured":"L. Guo, W. Fei, W. Dai, C. Li, J. Zou, and H. Xiong. 2022. Mixed-precision quantization of u-net for medical image segmentation. In Proceedings of the 2022 IEEE International Symposium on Circuits and Systems (ISCAS). 2871\u20132875."},{"volume-title":"Proceedings of the 2019 17th International Conference on Emerging eLearning Technologies and Applications (ICETA). 354\u2013360","author":"Kenyeres M.","key":"e_1_3_1_16_2","unstructured":"M. Kenyeres and J. Kenyeres. 2019. Generalized metropolis-hastings algorithm for distributed averaging with uniform quantization scheme. In Proceedings of the 2019 17th International Conference on Emerging eLearning Technologies and Applications (ICETA). 354\u2013360."},{"volume-title":"Proceedings of the 2024 5th International Conference on Computer Engineering and Application (ICCEA). 141\u2013144","author":"Zhu Z.","key":"e_1_3_1_17_2","unstructured":"Z. Zhu, S. Wang, H. Li, and X. Tian. 2024. Reducing neural networks quantization error through the change of feature correlation. In Proceedings of the 2024 5th International Conference on Computer Engineering and Application (ICCEA). 141\u2013144."},{"volume-title":"Proceedings of the 2019 IEEE\/CVF International Conference on Computer Vision (ICCV). 293\u2013302","author":"Dong Z.","key":"e_1_3_1_18_2","unstructured":"Z. Dong, Z. Yao, A. Gholami, M. Mahoney, and K. Keutzer. 2019. HAWQ: Hessian aware quantization of neural networks with mixed-precision. In Proceedings of the 2019 IEEE\/CVF International Conference on Computer Vision (ICCV). 293\u2013302."},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISOCC53507.2021.9613969"},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/LCA.2016.2597140"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1109\/JSSC.2018.2865489"},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.23919\/DATE54114.2022.9774679"},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2018.00069"},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1109\/JSSC.2022.3141050"},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSI.2020.2993051"},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSII.2022.3231418"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2023.3310916"},{"volume-title":"Proceedings of the 2017 IEEE International Solid-State Circuits Conference (ISSCC). 246\u2013247","author":"Moons B.","key":"e_1_3_1_28_2","unstructured":"B. Moons, R. Uytterhoeven, W. Dehaene, and M. Verhelst. 2017. 14.5 Envision: A 0.26-to-10TOPS\/W subword-parallel dynamic-voltage-accuracy-frequency-scalable convolutional neural network processor in 28nm FDSOI. In Proceedings of the 2017 IEEE International Solid-State Circuits Conference (ISSCC). 246\u2013247."},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2022.3210069"},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISSCC42615.2023.10067269"},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2025.3554523"},{"key":"e_1_3_1_32_2","volume-title":"Proceedings of the International Conference on Machine Learning. 11875\u201311886","author":"Yao Zhewei","year":"2021","unstructured":"Zhewei Yao, Zhen Dong, Zhangcheng Zheng, Amir Gholami, Jiali Yu, Eric Tan, Leyuan Wang, Qijing Huang, Yida Wang, Michael W. Mahoney, and Kurt Keutzer. 2021. HAWQ-V3: Dyadic neural network quantization. In Proceedings of the International Conference on Machine Learning. 11875\u201311886."},{"volume-title":"Proceedings of the 2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 1512\u20131521","author":"Liu Y.","key":"e_1_3_1_33_2","unstructured":"Y. Liu, W. Zhang, and J. Wang. 2021. Zero-shot adversarial quantization. In Proceedings of the 2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 1512\u20131521."},{"volume-title":"Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN). 2547\u20132554","author":"Alemdar H.","key":"e_1_3_1_34_2","unstructured":"H. Alemdar, V. Leroy, A. Prost-Boucle, and F. P\u00e9trot. 2017. Ternary neural networks for resource-efficient AI applications. In Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN). 2547\u20132554."},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neunet.2018.01.010"},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.23919\/DATE48585.2020.9116390"},{"key":"e_1_3_1_37_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISQED65160.2025.11014367"},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.1145\/3572917"},{"key":"e_1_3_1_39_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCASAI.2024.3520905"},{"key":"e_1_3_1_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2025.3550786"},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2022.3197503"},{"key":"e_1_3_1_42_2","doi-asserted-by":"publisher","DOI":"10.1109\/JETCAS.2019.2910232"},{"key":"e_1_3_1_43_2","doi-asserted-by":"publisher","DOI":"10.1145\/3579371.3589350"},{"key":"e_1_3_1_44_2","doi-asserted-by":"publisher","DOI":"10.1145\/3296957.3173176"},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1145\/3007787.3001139"}],"container-title":["ACM Transactions on Embedded Computing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3767336","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,8]],"date-time":"2025-10-08T13:51:56Z","timestamp":1759931516000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3767336"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,8]]},"references-count":44,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2025,11,30]]}},"alternative-id":["10.1145\/3767336"],"URL":"https:\/\/doi.org\/10.1145\/3767336","relation":{},"ISSN":["1539-9087","1558-3465"],"issn-type":[{"type":"print","value":"1539-9087"},{"type":"electronic","value":"1558-3465"}],"subject":[],"published":{"date-parts":[[2025,10,8]]},"assertion":[{"value":"2025-01-15","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-08-30","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-10-08","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}