{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,7]],"date-time":"2026-03-07T18:18:01Z","timestamp":1772907481280,"version":"3.50.1"},"publisher-location":"California","reference-count":0,"publisher":"International Joint Conferences on Artificial Intelligence Organization","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,7]]},"abstract":"<jats:p>We introduce a Power-of-Two post-training quantization (PTQ) method for deep neural networks that meets hardware requirements and does not require lengthy retraining. PTQ requires only a small set of calibration data and is easier to deploy, but results in lower accuracy than Quantization-Aware Training (QAT). Power-of-Two quantization can convert the multiplications introduced by quantization and dequantization into bit-shift operations, which are adopted by many efficient accelerators. However, the Power-of-Two scale has fewer candidate values, which leads to more rounding or clipping errors. We propose a novel Power-of-Two PTQ framework, dubbed RAPQ, which dynamically adjusts the Power-of-Two scales of the whole network instead of statically determining them layer by layer. It can theoretically trade off the rounding error and clipping error of the whole network. Meanwhile, the reconstruction method in RAPQ is based on the BN information of every unit. Extensive experiments on ImageNet demonstrate the excellent performance of our proposed method. Without bells and whistles, RAPQ reaches accuracies of 65% and 48% on ResNet-18 and MobileNetV2 respectively with INT2 weights and INT4 activations. We are the first to propose PTQ for the more constrained but hardware-friendly Power-of-Two quantization, and we show that it can achieve nearly the same accuracy as SOTA PTQ methods. The code will be released.<\/jats:p>","DOI":"10.24963\/ijcai.2022\/219","type":"proceedings-article","created":{"date-parts":[[2022,7,16]],"date-time":"2022-07-16T02:55:56Z","timestamp":1657940156000},"page":"1573-1579","source":"Crossref","is-referenced-by-count":8,"title":["RAPQ: Rescuing Accuracy for Power-of-Two Low-bit Post-training Quantization"],"prefix":"10.24963","author":[{"given":"Hongyi","family":"Yao","sequence":"first","affiliation":[{"name":"Peking University"}]},{"given":"Pu","family":"Li","sequence":"additional","affiliation":[{"name":"Peking University"}]},{"given":"Jian","family":"Cao","sequence":"additional","affiliation":[{"name":"Peking University"}]},{"given":"Xiangcheng","family":"Liu","sequence":"additional","affiliation":[{"name":"Peking University"}]},{"given":"Chenying","family":"Xie","sequence":"additional","affiliation":[{"name":"Peking University"}]},{"given":"Bingzhang","family":"Wang","sequence":"additional","affiliation":[{"name":"Peking University"}]}],"member":"10584","event":{"name":"Thirty-First International Joint Conference on Artificial Intelligence {IJCAI-22}","theme":"Artificial Intelligence","location":"Vienna, Austria","acronym":"IJCAI-2022","number":"31","sponsor":["International Joint Conferences on Artificial Intelligence Organization (IJCAI)"],"start":{"date-parts":[[2022,7,23]]},"end":{"date-parts":[[2022,7,29]]}},"container-title":["Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence"],"original-title":[],"deposited":{"date-parts":[[2022,7,18]],"date-time":"2022-07-18T11:08:25Z","timestamp":1658142505000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.ijcai.org\/proceedings\/2022\/219"}},"subtitle":[],"proceedings-subject":"Artificial Intelligence Research Articles","short-title":[],"issued":{"date-parts":[[2022,7]]},"references-count":0,"URL":"https:\/\/doi.org\/10.24963\/ijcai.2022\/219","relation":{},"subject":[],"published":{"date-parts":[[2022,7]]}}}