{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,25]],"date-time":"2026-04-25T15:00:00Z","timestamp":1777129200281,"version":"3.51.4"},"reference-count":63,"publisher":"MDPI AG","issue":"24","license":[{"start":{"date-parts":[[2022,12,13]],"date-time":"2022-12-13T00:00:00Z","timestamp":1670889600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Israel Innovation Authority"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Quantized neural networks (QNNs) are among the main approaches for deploying deep neural networks on low-resource edge devices. Training QNNs using different levels of precision throughout the network (mixed-precision quantization) typically achieves superior trade-offs between performance and computational load. However, optimizing the different precision levels of QNNs can be complicated, as the values of the bit allocations are discrete and difficult to differentiate for. Moreover, adequately accounting for the dependencies between the bit allocation of different layers is not straightforward. To meet these challenges, in this work, we propose GradFreeBits: a novel joint optimization scheme for training mixed-precision QNNs, which alternates between gradient-based optimization for the weights and gradient-free optimization for the bit allocation. Our method achieves a better or on par performance with the current state-of-the-art low-precision classification networks on CIFAR10\/100 and ImageNet, semantic segmentation networks on Cityscapes, and several graph neural networks benchmarks. Furthermore, our approach can be extended to a variety of other applications involving neural networks used in conjunction with parameters that are difficult to optimize for.<\/jats:p>","DOI":"10.3390\/s22249772","type":"journal-article","created":{"date-parts":[[2022,12,14]],"date-time":"2022-12-14T03:21:52Z","timestamp":1670988112000},"page":"9772","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["GradFreeBits: Gradient-Free Bit Allocation for Mixed-Precision Neural Networks"],"prefix":"10.3390","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8742-4081","authenticated-orcid":false,"given":"Benjamin Jacob","family":"Bodner","sequence":"first","affiliation":[{"name":"Department of Computer Science, Ben-Gurion University, Beer Sheva 8410501, Israel"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Gil","family":"Ben-Shalom","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Ben-Gurion University, Beer Sheva 8410501, Israel"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5351-0966","authenticated-orcid":false,"given":"Eran","family":"Treister","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Ben-Gurion University, Beer Sheva 8410501, Israel"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2022,12,13]]},"reference":[{"key":"ref_1","unstructured":"Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., and Askell, A. (2020, January 6\u201312). Language models are few-shot learners. Proceedings of the Advances in Neural Information Processing Systems, NeurIPS, Virtual."},{"key":"ref_2","first-page":"6105","article-title":"EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks","volume":"Volume 97","author":"Tan","year":"2019","journal-title":"Proceedings of the International Conference on Machine Learning, ICML"},{"key":"ref_3","first-page":"491","article-title":"Big Transfer (BiT): General Visual Representation Learning","volume":"Volume 12350","author":"Kolesnikov","year":"2020","journal-title":"Proceedings of the European Conference on Computer Vision, ECCV"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"264","DOI":"10.1016\/j.eng.2020.01.007","article-title":"A survey of accelerator architectures for deep neural networks","volume":"6","author":"Chen","year":"2020","journal-title":"Engineering"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"126","DOI":"10.1109\/MSP.2017.2765695","article-title":"Model Compression and Acceleration for Deep Neural Networks: The Principles, Progress, and Challenges","volume":"35","author":"Cheng","year":"2018","journal-title":"IEEE Signal Process. Mag."},{"key":"ref_6","first-page":"129","article-title":"What is the state of neural network pruning?","volume":"2","author":"Blalock","year":"2020","journal-title":"Proc. Mach. Learn. Syst."},{"key":"ref_7","unstructured":"Cho, J.H., and Hariharan, B. (November, January 27). On the Efficacy of Knowledge Distillation. Proceedings of the International Conference on Computer Vision, ICCV, Seoul, Korea."},{"key":"ref_8","unstructured":"Tung, F., and Mori, G. (November, January 27). Similarity-Preserving Knowledge Distillation. Proceedings of the International Conference on Computer Vision, ICCV, Seoul, Korea."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"76:1","DOI":"10.1145\/3447582","article-title":"A Comprehensive Survey of Neural Architecture Search: Challenges and Solutions","volume":"54","author":"Ren","year":"2022","journal-title":"ACM Comput. Surv."},{"key":"ref_10","first-page":"187:1","article-title":"Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations","volume":"18","author":"Hubara","year":"2017","journal-title":"J. Mach. Learn. Res."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"568","DOI":"10.1109\/TPAMI.2018.2886192","article-title":"Deep Neural Network Compression by In-Parallel Pruning-Quantization","volume":"42","author":"Tung","year":"2020","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_12","unstructured":"Banner, R., Nahshan, Y., and Soudry, D. (2019, January 8\u201314). Post training 4-bit quantization of convolutional networks for rapid-deployment. Proceedings of the Advances in Neural Information Processing Systems, NeurIPS, Vancouver, BC, Canada."},{"key":"ref_13","unstructured":"Choi, J., Venkataramani, S., Srinivasan, V., Gopalakrishnan, K., Wang, Z., and Chuang, P. (April, January 31). Accurate and Efficient 2-bit Quantized Neural Networks. Proceedings of the Machine Learning and Systems, Stanford, CA, USA."},{"key":"ref_14","unstructured":"Zhao, R., Hu, Y., Dotzel, J., Sa, C.D., and Zhang, Z. (2019, January 9\u201315). Improving Neural Network Quantization without Retraining using Outlier Channel Splitting. Proceedings of the International Conference on Machine Learning, ICML, Long Beach, CA, USA."},{"key":"ref_15","unstructured":"Bai, H., Cao, M., Huang, P., and Shan, J. (2021, January 19). BatchQuant: Quantized-for-all Architecture Search with Robust Quantizer. Proceedings of the Advances in Neural Information Processing Systems, NeurIPS, Virtual."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Liu, Y., Zhang, W., and Wang, J. (2021, January 19\u201325). Zero-Shot Adversarial Quantization. Proceedings of the Conference on Computer Vision and Pattern Recognition, CVPR, Virtual.","DOI":"10.1109\/CVPR46437.2021.00156"},{"key":"ref_17","unstructured":"Zhou, S., Ni, Z., Zhou, X., Wen, H., Wu, Y., and Zou, Y. (2016). DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients. arXiv."},{"key":"ref_18","unstructured":"Choi, J., Wang, Z., Venkataramani, S., Chuang, P.I., Srinivasan, V., and Gopalakrishnan, K. (2018). PACT: Parameterized Clipping Activation for Quantized Neural Networks. arXiv."},{"key":"ref_19","unstructured":"Jin, Q., Yang, L., Liao, Z., and Qian, X. (2020, January 7\u201310). Neural Network Quantization with Scale-Adjusted Training. Proceedings of the British Machine Vision Conference, BMVC, Virtual Event, UK."},{"key":"ref_20","unstructured":"Cai, W., and Li, W. (2019). Weight Normalization based Quantization for Deep Neural Network Compression. arXiv."},{"key":"ref_21","first-page":"373","article-title":"LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks","volume":"Volume 11212","author":"Zhang","year":"2018","journal-title":"Proceedings of the European Conference on Computer Vision, ECCV"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"1247","DOI":"10.1007\/s10898-012-9951-y","article-title":"Derivative-free optimization: A review of algorithms and comparison of software implementations","volume":"56","author":"Rios","year":"2013","journal-title":"J. Glob. Optim."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1162\/106365603321828970","article-title":"Reducing the Time Complexity of the Derandomized Evolution Strategy with Covariance Matrix Adaptation (CMA-ES)","volume":"11","author":"Hansen","year":"2003","journal-title":"Evol. Comput."},{"key":"ref_24","unstructured":"Li, Y., Dong, X., and Wang, W. (2020, January 26\u201330). Additive Powers-of-Two Quantization: An Efficient Non-uniform Discretization for Neural Networks. Proceedings of the International Conference on Learning Representations, ICLR, Addis Ababa, Ethiopia."},{"key":"ref_25","unstructured":"Gong, R., Liu, X., Jiang, S., Li, T., Hu, P., Lin, J., Yu, F., and Yan, J. (November, January 29). Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks. Proceedings of the International Conference on Computer Vision, ICCV, Seoul, Korea."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s40687-018-0177-6","article-title":"Blended coarse gradient descent for full quantization of deep neural networks","volume":"6","author":"Yin","year":"2019","journal-title":"Res. Math. Sci."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Yamamoto, K. (2021, January 20\u201325). Learnable Companding Quantization for Accurate Low-Bit Neural Networks. Proceedings of the Conference on Computer Vision and Pattern Recognition, CVPR, Virtual.","DOI":"10.1109\/CVPR46437.2021.00499"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Wang, L., Dong, X., Wang, Y., Liu, L., An, W., and Guo, Y. (2022, January 18\u201324). Learnable Lookup Table for Neural Network Quantization. Proceedings of the Conference on Computer Vision and Pattern Recognition, CVPR, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01210"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1109\/MM.2020.3009475","article-title":"ReLeQ: A Reinforcement Learning Approach for Automatic Deep Quantization of Neural Networks","volume":"40","author":"Elthakeb","year":"2020","journal-title":"IEEE Micro"},{"key":"ref_30","unstructured":"Wang, K., Liu, Z., Lin, Y., Lin, J., and Han, S. (201, January 15\u201320). HAQ: Hardware-Aware Automated Quantization With Mixed Precision. Proceedings of the Conference on Computer Vision and Pattern Recognition, CVPR, Long Beach, CA, USA."},{"key":"ref_31","unstructured":"Dong, Z., Yao, Z., Gholami, A., Mahoney, M.W., and Keutzer, K. (November, January 27). HAWQ: Hessian AWare Quantization of Neural Networks With Mixed-Precision. Proceedings of the International Conference on Computer Vision, ICCV, Seoul, Korea."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Dong, Z., Yao, Z., Arfeen, D., Gholami, A., Mahoney, M.W., and Keutzer, K. (2020, January 6\u201312). HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks. Proceedings of the Advances in Neural Information Processing Systems, NeurIPS, Virtual.","DOI":"10.1109\/ICCV.2019.00038"},{"key":"ref_33","unstructured":"Uhlich, S., Mauch, L., Cardinaux, F., Yoshiyama, K., Garc\u00eda, J.A., Tiedemann, S., Kemp, T., and Nakamura, A. (2020, January 26\u201330). Mixed Precision DNNs: All you need is a good parametrization. Proceedings of the International Conference on Learning Representations, ICLR, Addis Ababa, Ethiopia."},{"key":"ref_34","unstructured":"Li, Y., Wang, W., Bai, H., Gong, R., Dong, X., and Yu, F. (2020). Efficient Bitwidth Search for Practical Mixed Precision Neural Network. arXiv."},{"key":"ref_35","first-page":"544","article-title":"Single Path One-Shot Neural Architecture Search with Uniform Sampling","volume":"Volume 12361","author":"Guo","year":"2020","journal-title":"Proceedings of the European Conference on Computer Vision, ECCV"},{"key":"ref_36","first-page":"1","article-title":"Search What You Want: Barrier Panelty NAS for Mixed Precision Quantization","volume":"Volume 12354","author":"Yu","year":"2020","journal-title":"Proceedings of the European Conference on Computer Vision, ECCV"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Shen, M., Liang, F., Gong, R., Li, Y., Li, C., Lin, C., Yu, F., Yan, J., and Ouyang, W. (2021, January 10\u201317). Once Quantization-Aware Training: High Performance Extremely Low-bit Architecture Search. Proceedings of the International Conference on Computer Vision, ICCV, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00529"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Wang, T., Wang, K., Cai, H., Lin, J., Liu, Z., Wang, H., Lin, Y., and Han, S. (2020, January 13\u201319). APQ: Joint Search for Network Architecture, Pruning and Quantization Policy. Proceedings of the Conference on Computer Vision and Pattern Recognition, CVPR, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00215"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"18327","DOI":"10.1007\/s00521-020-04969-6","article-title":"Automated design of error-resilient and hardware-efficient deep neural networks","volume":"32","author":"Schorn","year":"2020","journal-title":"Neural Comput. Appl."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Hu, P., Peng, X., Zhu, H., Aly, M.M.S., and Lin, J. (2021, January 2\u20139). OPQ: Compressing Deep Neural Networks with One-shot Pruning-Quantization. Proceedings of the Thirty-Fifth Conference on Artificial Intelligence, Thirty-Third Conference on Innovative Applications of Artificial Intelligence, The Eleventh Symposium on Educational Advances in Artificial Intelligence, Virtual Event.","DOI":"10.1609\/aaai.v35i9.16950"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Yang, H., Gui, S., Zhu, Y., and Liu, J. (2020, January 13\u201319). Automatic Neural Network Compression by Sparsity-Quantization Joint Learning: A Constrained Optimization-Based Approach. Proceedings of the Conference on Computer Vision and Pattern Recognition, CVPR, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00225"},{"key":"ref_42","unstructured":"Bengio, Y., L\u00e9onard, N., and Courville, A.C. (2013). Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation. arXiv."},{"key":"ref_43","first-page":"428","article-title":"Evolution Strategies for Direct Policy Search","volume":"Volume 5199","author":"Igel","year":"2008","journal-title":"Proceedings of the Parallel Problem Solving from Nature"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"218","DOI":"10.1016\/j.ins.2020.03.112","article-title":"A hybrid cooperative co-evolution algorithm framework for optimising power take off and placements of wave energy converters","volume":"534","author":"Neshat","year":"2020","journal-title":"Inf. Sci."},{"key":"ref_45","unstructured":"Loshchilov, I., and Hutter, F. (2016). CMA-ES for Hyperparameter Optimization of Deep Neural Networks. arXiv."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Hansen, N. (2009, January 8\u201312). Benchmarking a BI-population CMA-ES on the BBOB-2009 function testbed. Proceedings of the Genetic and Evolutionary Computation Conference, Montreal, QC, Canada.","DOI":"10.1145\/1570256.1570333"},{"key":"ref_47","unstructured":"Johnson, R., and Zhang, T. (2013, January 5\u201310). Accelerating Stochastic Gradient Descent using Predictive Variance Reduction. Proceedings of the Advances in Neural Information Processing Systems, NeurIPS, Lake Tahoe, NV, USA."},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Marcel, S., and Rodriguez, Y. (2010). Torchvision the machine-vision package of torch. Int. Conf. Multimed., 1485\u20131488.","DOI":"10.1145\/1873951.1874254"},{"key":"ref_49","unstructured":"Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., and Antiga, L. (2019, January 8\u201314). PyTorch: An Imperative Style, High-Performance Deep Learning Library. Proceedings of the Advances in Neural Information Processing Systems, NeurIPS, Vancouver, BC, Canada."},{"key":"ref_50","unstructured":"Krizhevsky, A., and Hinton, G. (2009). Learning multiple layers of features from tiny images. Neural Inf. Process. Syst."},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1007\/s11263-015-0816-y","article-title":"ImageNet large scale visual recognition challenge","volume":"115","author":"Russakovsky","year":"2015","journal-title":"Int. J. Comput. Vis."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S., and Schiele, B. (2016, January 27\u201330). The Cityscapes Dataset for Semantic Urban Scene Understanding. Proceedings of the Conference on Computer Vision and Pattern Recognition, CVPR, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.350"},{"key":"ref_53","unstructured":"Zhang, H., Ciss\u00e9, M., Dauphin, Y.N., and Lopez-Paz, D. (May, January 30). mixup: Beyond Empirical Risk Minimization. Proceedings of the International Conference on Learning Representations, ICLR, Vancouver, BC, Canada."},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"2038","DOI":"10.1109\/TPAMI.2019.2907634","article-title":"Towards Efficient U-Nets: A Coupled and Quantized Approach","volume":"42","author":"Tang","year":"2020","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_55","unstructured":"Chen, L., Papandreou, G., Schroff, F., and Adam, H. (2017). Rethinking Atrous Convolution for Semantic Image Segmentation. arXiv."},{"key":"ref_56","first-page":"833","article-title":"Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation","volume":"Volume 11211","author":"Chen","year":"2018","journal-title":"Proceedings of the European Conference on Computer Vision, ECCV"},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep Residual Learning for Image Recognition. Proceedings of the Conference on Computer Vision and Pattern Recognition, CVPR, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_58","doi-asserted-by":"crossref","unstructured":"Sandler, M., Howard, A.G., Zhu, M., Zhmoginov, A., and Chen, L. (2018, January 18\u201322). MobileNetV2: Inverted Residuals and Linear Bottlenecks. Proceedings of the Conference on Computer Vision and Pattern Recognition, CVPR, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00474"},{"key":"ref_59","unstructured":"Eliasof, M., Bodner, B.J., and Treister, E. (2021). Haar Wavelet Feature Compression for Quantized Graph Convolutional Networks. arXiv."},{"key":"ref_60","unstructured":"Chen, M., Wei, Z., Huang, Z., Ding, B., and Li, Y. (July, January 29). Simple and Deep Graph Convolutional Networks. Proceedings of the International Conference on Machine Learning, ICML, Virtual Event."},{"key":"ref_61","unstructured":"Yang, Z., Cohen, W.W., and Salakhutdinov, R. (2016, January 19\u201324). Revisiting Semi-Supervised Learning with Graph Embeddings. Proceedings of the 33nd International Conference on Machine Learning, ICML, New York City, NY, USA."},{"key":"ref_62","unstructured":"Fey, M., and Lenssen, J.E. (2019). Fast Graph Representation Learning with PyTorch Geometric. arXiv."},{"key":"ref_63","unstructured":"Hansen, N. (2016). The CMA Evolution Strategy: A Tutorial. arXiv."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/24\/9772\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T01:40:24Z","timestamp":1760146824000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/24\/9772"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,12,13]]},"references-count":63,"journal-issue":{"issue":"24","published-online":{"date-parts":[[2022,12]]}},"alternative-id":["s22249772"],"URL":"https:\/\/doi.org\/10.3390\/s22249772","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,12,13]]}}}