{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,21]],"date-time":"2026-02-21T18:59:49Z","timestamp":1771700389187,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":62,"publisher":"ACM","license":[{"start":{"date-parts":[[2023,6,17]],"date-time":"2023-06-17T00:00:00Z","timestamp":1686960000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"NSF","award":["CCF-1937403"],"award-info":[{"award-number":["CCF-1937403"]}]},{"name":"NSF","award":["CCF-1955909"],"award-info":[{"award-number":["CCF-1955909"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2023,6,17]]},"DOI":"10.1145\/3579371.3589103","type":"proceedings-article","created":{"date-parts":[[2023,6,16]],"date-time":"2023-06-16T20:25:28Z","timestamp":1686947128000},"page":"1-13","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":10,"title":["ETTE: Efficient Tensor-Train-based Computing Engine for Deep Neural Networks"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5465-9044","authenticated-orcid":false,"given":"Yu","family":"Gong","sequence":"first","affiliation":[{"name":"Rutger University, Piscataway, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5554-5417","authenticated-orcid":false,"given":"Miao","family":"Yin","sequence":"additional","affiliation":[{"name":"Rutgers University, Piscataway, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8204-4837","authenticated-orcid":false,"given":"Lingyi","family":"Huang","sequence":"additional","affiliation":[{"name":"Rutgers University, Piscataway, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-7311-9413","authenticated-orcid":false,"given":"Jinqi","family":"Xiao","sequence":"additional","affiliation":[{"name":"Rutgers University, Piscataway, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3020-0612","authenticated-orcid":false,"given":"Yang","family":"Sui","sequence":"additional","affiliation":[{"name":"Rutgers University, Piscataway, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0258-0155","authenticated-orcid":false,"given":"Chunhua","family":"Deng","sequence":"additional","affiliation":[{"name":"Rutgers University, Milpitas, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3978-2930","authenticated-orcid":false,"given":"Bo","family":"Yuan","sequence":"additional","affiliation":[{"name":"Rutgers University, Piscataway, USA"}]}],"member":"320","published-online":{"date-parts":[[2023,6,17]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/3123939.3123982"},{"key":"e_1_3_2_1_2_1","volume-title":"2016 49th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO). IEEE, 1--12","author":"Alwani Manoj","year":"2016","unstructured":"Manoj Alwani , Han Chen , Michael Ferdman , and Peter Milder . 2016 . Fused-layer CNN accelerators . In 2016 49th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO). IEEE, 1--12 . Manoj Alwani, Han Chen, Michael Ferdman, and Peter Milder. 2016. Fused-layer CNN accelerators. In 2016 49th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO). IEEE, 1--12."},{"key":"e_1_3_2_1_3_1","volume-title":"DeepBurning-SEG: Generating DNN Accelerators of Segment-Grained Pipeline Architecture. 
In 2022 55th IEEE\/ACM International Symposium on Microarchitecture (MICRO). IEEE, 1396--1413","author":"Cai Xuyi","year":"2022","unstructured":"Xuyi Cai , Ying Wang , Xiaohan Ma , Yinhe Han , and Lei Zhang . 2022 . DeepBurning-SEG: Generating DNN Accelerators of Segment-Grained Pipeline Architecture. In 2022 55th IEEE\/ACM International Symposium on Microarchitecture (MICRO). IEEE, 1396--1413 . Xuyi Cai, Ying Wang, Xiaohan Ma, Yinhe Han, and Lei Zhang. 2022. DeepBurning-SEG: Generating DNN Accelerators of Segment-Grained Pipeline Architecture. In 2022 55th IEEE\/ACM International Symposium on Microarchitecture (MICRO). IEEE, 1396--1413."},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/3007787.3001177"},{"key":"e_1_3_2_1_5_1","volume-title":"A survey of model compression and acceleration for deep neural networks. arXiv preprint arXiv:1710.09282","author":"Cheng Yu","year":"2017","unstructured":"Yu Cheng , Duo Wang , Pan Zhou , and Tao Zhang . 2017. A survey of model compression and acceleration for deep neural networks. arXiv preprint arXiv:1710.09282 ( 2017 ). Yu Cheng, Duo Wang, Pan Zhou, and Tao Zhang. 2017. A survey of model compression and acceleration for deep neural networks. arXiv preprint arXiv:1710.09282 (2017)."},{"key":"e_1_3_2_1_6_1","volume-title":"Binaryconnect: Training deep neural networks with binary weights during propagations. In Advances in Neural Information Processing Systems. 3123--3131.","author":"Courbariaux Matthieu","year":"2015","unstructured":"Matthieu Courbariaux , Yoshua Bengio , and Jean-Pierre David . 2015 . Binaryconnect: Training deep neural networks with binary weights during propagations. In Advances in Neural Information Processing Systems. 3123--3131. Matthieu Courbariaux, Yoshua Bengio, and Jean-Pierre David. 2015. Binaryconnect: Training deep neural networks with binary weights during propagations. In Advances in Neural Information Processing Systems. 3123--3131."},{"key":"e_1_3_2_1_7_1","volume-title":"Transformer-xl: Attentive language models beyond a fixed-length context. arXiv preprint arXiv:1901.02860","author":"Dai Zihang","year":"2019","unstructured":"Zihang Dai , Zhilin Yang , Yiming Yang , Jaime Carbonell , Quoc V Le , and Ruslan Salakhutdinov . 2019 . Transformer-xl: Attentive language models beyond a fixed-length context. arXiv preprint arXiv:1901.02860 (2019). Zihang Dai, Zhilin Yang, Yiming Yang, Jaime Carbonell, Quoc V Le, and Ruslan Salakhutdinov. 2019. Transformer-xl: Attentive language models beyond a fixed-length context. arXiv preprint arXiv:1901.02860 (2019)."},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA52012.2021.00090"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/3307650.3322258"},{"key":"e_1_3_2_1_10_1","volume-title":"2019 IEEE\/ACM International Conference on Computer-Aided Design (ICCAD). IEEE, 1--6.","author":"Deng Chunhua","year":"2019","unstructured":"Chunhua Deng , Miao Yin , Xiao-Yang Liu , Xiaodong Wang , and Bo Yuan . 2019 . High-performance hardware architecture for tensor singular value decomposition . In 2019 IEEE\/ACM International Conference on Computer-Aided Design (ICCAD). IEEE, 1--6. Chunhua Deng, Miao Yin, Xiao-Yang Liu, Xiaodong Wang, and Bo Yuan. 2019. High-performance hardware architecture for tensor singular value decomposition. In 2019 IEEE\/ACM International Conference on Computer-Aided Design (ICCAD). 
IEEE, 1--6."},{"key":"e_1_3_2_1_11_1","volume-title":"Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805","author":"Devlin Jacob","year":"2018","unstructured":"Jacob Devlin , Ming-Wei Chang , Kenton Lee , and Kristina Toutanova . 2018 . Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018). Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)."},{"key":"e_1_3_2_1_12_1","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision. 293--302","author":"Dong Zhen","year":"2019","unstructured":"Zhen Dong , Zhewei Yao , Amir Gholami , Michael W Mahoney , and Kurt Keutzer . 2019 . Hawq: Hessian aware quantization of neural networks with mixed-precision . In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 293--302 . Zhen Dong, Zhewei Yao, Amir Gholami, Michael W Mahoney, and Kurt Keutzer. 2019. Hawq: Hessian aware quantization of neural networks with mixed-precision. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 293--302."},{"key":"e_1_3_2_1_13_1","volume-title":"2022 55th IEEE\/ACM International Symposium on Microarchitecture (MICRO). IEEE, 599--615","author":"Fan Hongxiang","year":"2022","unstructured":"Hongxiang Fan , Thomas Chau , Stylianos I Venieris , Royson Lee , Alexandros Kouris , Wayne Luk , Nicholas D Lane , and Mohamed S Abdelfattah . 2022 . Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and Algorithm Co-design . In 2022 55th IEEE\/ACM International Symposium on Microarchitecture (MICRO). IEEE, 599--615 . Hongxiang Fan, Thomas Chau, Stylianos I Venieris, Royson Lee, Alexandros Kouris, Wayne Luk, Nicholas D Lane, and Mohamed S Abdelfattah. 2022. Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and Algorithm Co-design. In 2022 55th IEEE\/ACM International Symposium on Microarchitecture (MICRO). IEEE, 599--615."},{"key":"e_1_3_2_1_14_1","volume-title":"Ultimate tensorization: compressing convolutional and fc layers alike. arXiv preprint arXiv:1611.03214","author":"Garipov Timur","year":"2016","unstructured":"Timur Garipov , Dmitry Podoprikhin , Alexander Novikov , and Dmitry Vetrov . 2016. Ultimate tensorization: compressing convolutional and fc layers alike. arXiv preprint arXiv:1611.03214 ( 2016 ). Timur Garipov, Dmitry Podoprikhin, Alexander Novikov, and Dmitry Vetrov. 2016. Ultimate tensorization: compressing convolutional and fc layers alike. arXiv preprint arXiv:1611.03214 (2016)."},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00495"},{"key":"e_1_3_2_1_16_1","first-page":"3101","article-title":"Algorithm and Hardware Co-Design of Energy-Efficient LSTM Networks for Video Recognition with Hierarchical Tucker Tensor Decomposition","volume":"71","author":"Gong Yu","year":"2022","unstructured":"Yu Gong , Miao Yin , Lingyi Huang , Chunhua Deng , and Bo Yuan . 2022 . Algorithm and Hardware Co-Design of Energy-Efficient LSTM Networks for Video Recognition with Hierarchical Tucker Tensor Decomposition . IEEE Trans. Comput. 71 , 12 (2022), 3101 -- 3114 . Yu Gong, Miao Yin, Lingyi Huang, Chunhua Deng, and Bo Yuan. 2022. Algorithm and Hardware Co-Design of Energy-Efficient LSTM Networks for Video Recognition with Hierarchical Tucker Tensor Decomposition. IEEE Trans. 
Comput. 71, 12 (2022), 3101--3114.","journal-title":"IEEE Trans. Comput."},{"key":"e_1_3_2_1_17_1","volume-title":"2021 IEEE International Solid-State Circuits Conference (ISSCC)","volume":"64","author":"Guo Ruiqi","year":"2021","unstructured":"Ruiqi Guo , Zhiheng Yue , Xin Si , Te Hu , Hao Li , Limei Tang , Yabing Wang , Leibo Liu , Meng-Fan Chang , Qiang Li , 2021 . 15.4 a 5.99-to-691.1 tops\/w tensor-train in-memory-computing processor using bit-level-sparsity-based optimization and variable-precision quantization . In 2021 IEEE International Solid-State Circuits Conference (ISSCC) , Vol. 64 . IEEE, 242--244. Ruiqi Guo, Zhiheng Yue, Xin Si, Te Hu, Hao Li, Limei Tang, Yabing Wang, Leibo Liu, Meng-Fan Chang, Qiang Li, et al. 2021. 15.4 a 5.99-to-691.1 tops\/w tensor-train in-memory-computing processor using bit-level-sparsity-based optimization and variable-precision quantization. In 2021 IEEE International Solid-State Circuits Conference (ISSCC), Vol. 64. IEEE, 242--244."},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3466752.3480127"},{"key":"e_1_3_2_1_19_1","volume-title":"Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. arXiv preprint arXiv:1510.00149","author":"Han Song","year":"2015","unstructured":"Song Han , Huizi Mao , and William J Dally . 2015. Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. arXiv preprint arXiv:1510.00149 ( 2015 ). Song Han, Huizi Mao, and William J Dally. 2015. Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. arXiv preprint arXiv:1510.00149 (2015)."},{"key":"e_1_3_2_1_20_1","first-page":"1135","article-title":"Learning both weights and connections for efficient neural network","volume":"28","author":"Han Song","year":"2015","unstructured":"Song Han , Jeff Pool , John Tran , and William Dally . 2015 . Learning both weights and connections for efficient neural network . Advances in Neural Information Processing Systems 28 (2015), 1135 -- 1143 . Song Han, Jeff Pool, John Tran, and William Dally. 2015. Learning both weights and connections for efficient neural network. Advances in Neural Information Processing Systems 28 (2015), 1135--1143.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_1_21_1","volume-title":"Proceedings of the 49th Annual International Symposium on Computer Architecture. 522--535","author":"Hanson Edward","year":"2022","unstructured":"Edward Hanson , Shiyu Li , Hai'Helen' Li , and Yiran Chen . 2022 . Cascading structured pruning: enabling high data reuse for sparse DNN accelerators . In Proceedings of the 49th Annual International Symposium on Computer Architecture. 522--535 . Edward Hanson, Shiyu Li, Hai'Helen' Li, and Yiran Chen. 2022. Cascading structured pruning: enabling high data reuse for sparse DNN accelerators. In Proceedings of the 49th Annual International Symposium on Computer Architecture. 522--535."},{"key":"e_1_3_2_1_22_1","volume-title":"Cambricon-P: A Bitflow Architecture for Arbitrary Precision Computing. In 2022 55th IEEE\/ACM International Symposium on Microarchitecture (MICRO). IEEE, 57--72","author":"Hao Yifan","year":"2022","unstructured":"Yifan Hao , Yongwei Zhao , Chenxiao Liu , Zidong Du , Shuyao Cheng , Xiaqing Li , Xing Hu , Qi Guo , Zhiwei Xu , and Tianshi Chen . 2022 . Cambricon-P: A Bitflow Architecture for Arbitrary Precision Computing. 
In 2022 55th IEEE\/ACM International Symposium on Microarchitecture (MICRO). IEEE, 57--72 . Yifan Hao, Yongwei Zhao, Chenxiao Liu, Zidong Du, Shuyao Cheng, Xiaqing Li, Xing Hu, Qi Guo, Zhiwei Xu, and Tianshi Chen. 2022. Cambricon-P: A Bitflow Architecture for Arbitrary Precision Computing. In 2022 55th IEEE\/ACM International Symposium on Microarchitecture (MICRO). IEEE, 57--72."},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.155"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.findings-emnlp.436"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA45697.2020.00083"},{"key":"e_1_3_2_1_26_1","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 2704--2713","author":"Jacob Benoit","year":"2018","unstructured":"Benoit Jacob , Skirmantas Kligys , Bo Chen , Menglong Zhu , Matthew Tang , Andrew Howard , Hartwig Adam , and Dmitry Kalenichenko . 2018 . Quantization and training of neural networks for efficient integer-arithmetic-only inference . In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 2704--2713 . Benoit Jacob, Skirmantas Kligys, Bo Chen, Menglong Zhu, Matthew Tang, Andrew Howard, Hartwig Adam, and Dmitry Kalenichenko. 2018. Quantization and training of neural networks for efficient integer-arithmetic-only inference. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 2704--2713."},{"key":"e_1_3_2_1_27_1","volume-title":"SysML Conference","volume":"120","author":"Lee Ching-En","year":"2018","unstructured":"Ching-En Lee , Yakun Sophia Shao , Jie-Fang Zhang , Angshuman Parashar , Joel Emer , Stephen W Keckler , and Zhengya Zhang . 2018 . Stitch-x: An accelerator architecture for exploiting unstructured sparsity in deep neural networks . In SysML Conference , Vol. 120 . Ching-En Lee, Yakun Sophia Shao, Jie-Fang Zhang, Angshuman Parashar, Joel Emer, Stephen W Keckler, and Zhengya Zhang. 2018. Stitch-x: An accelerator architecture for exploiting unstructured sparsity in deep neural networks. In SysML Conference, Vol. 120."},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/3470496.3527423"},{"key":"e_1_3_2_1_29_1","volume-title":"Algorithm and Hardware Co-Design Co-Optimization Framework for LSTM Accelerator using Fully Decomposed Tensor Train. DAC (Work-in-Progress)","author":"Liu Mingshuo","year":"2021","unstructured":"Mingshuo Liu , Miao Yin , Kevin Han , Shiyi Luo , Mingju Liu , Ronald F DeMara , Bo Yuan , and Yu Bai . 2021. Algorithm and Hardware Co-Design Co-Optimization Framework for LSTM Accelerator using Fully Decomposed Tensor Train. DAC (Work-in-Progress) ( 2021 ). Mingshuo Liu, Miao Yin, Kevin Han, Shiyi Luo, Mingju Liu, Ronald F DeMara, Bo Yuan, and Yu Bai. 2021. Algorithm and Hardware Co-Design Co-Optimization Framework for LSTM Accelerator using Fully Decomposed Tensor Train. DAC (Work-in-Progress) (2021)."},{"key":"e_1_3_2_1_30_1","volume-title":"2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA). IEEE, 573--586","author":"Liu Zhi-Gang","year":"2022","unstructured":"Zhi-Gang Liu , Paul N Whatmough , Yuhao Zhu , and Matthew Mattina . 2022 . S2TA: Exploiting Structured Sparsity for Energy-Efficient Mobile CNN Acceleration . In 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA). IEEE, 573--586 . Zhi-Gang Liu, Paul N Whatmough, Yuhao Zhu, and Matthew Mattina. 2022. 
S2TA: Exploiting Structured Sparsity for Energy-Efficient Mobile CNN Acceleration. In 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA). IEEE, 573--586."},{"key":"e_1_3_2_1_31_1","volume-title":"Proceedings of the 49th Annual International Symposium on Computer Architecture. 993--1011","author":"Mudigere Dheevatsa","year":"2022","unstructured":"Dheevatsa Mudigere , Yuchen Hao , Jianyu Huang , Zhihao Jia , Andrew Tulloch , Srinivas Sridharan , Xing Liu , Mustafa Ozdal , Jade Nie , Jongsoo Park , 2022 . Software-hardware co-design for fast and scalable training of deep learning recommendation models . In Proceedings of the 49th Annual International Symposium on Computer Architecture. 993--1011 . Dheevatsa Mudigere, Yuchen Hao, Jianyu Huang, Zhihao Jia, Andrew Tulloch, Srinivas Sridharan, Xing Liu, Mustafa Ozdal, Jade Nie, Jongsoo Park, et al. 2022. Software-hardware co-design for fast and scalable training of deep learning recommendation models. In Proceedings of the 49th Annual International Symposium on Computer Architecture. 993--1011."},{"key":"e_1_3_2_1_32_1","volume-title":"Jianyu Huang, Narayanan Sundaraman, Jongsoo Park, Xiaodong Wang, Udit Gupta, Carole-Jean Wu, Alisson G Azzolini, et al.","author":"Naumov Maxim","year":"2019","unstructured":"Maxim Naumov , Dheevatsa Mudigere , Hao-Jun Michael Shi , Jianyu Huang, Narayanan Sundaraman, Jongsoo Park, Xiaodong Wang, Udit Gupta, Carole-Jean Wu, Alisson G Azzolini, et al. 2019 . Deep learning recommendation model for personalization and recommendation systems. arXiv preprint arXiv:1906.00091 (2019). Maxim Naumov, Dheevatsa Mudigere, Hao-Jun Michael Shi, Jianyu Huang, Narayanan Sundaraman, Jongsoo Park, Xiaodong Wang, Udit Gupta, Carole-Jean Wu, Alisson G Azzolini, et al. 2019. Deep learning recommendation model for personalization and recommendation systems. arXiv preprint arXiv:1906.00091 (2019)."},{"key":"e_1_3_2_1_33_1","first-page":"442","article-title":"Tensorizing neural networks","volume":"28","author":"Novikov Alexander","year":"2015","unstructured":"Alexander Novikov , Dmitrii Podoprikhin , Anton Osokin , and Dmitry P Vetrov . 2015 . Tensorizing neural networks . Advances in Neural Information Processing Systems 28 (2015), 442 -- 450 . Alexander Novikov, Dmitrii Podoprikhin, Anton Osokin, and Dmitry P Vetrov. 2015. Tensorizing neural networks. Advances in Neural Information Processing Systems 28 (2015), 442--450.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33014683"},{"key":"e_1_3_2_1_35_1","volume-title":"CSTAR: Towards Compact and STructured Deep Neural Networks with Adversarial Robustness. AAAI","author":"Phan Huy","year":"2023","unstructured":"Huy Phan , Miao Yin , Yang Sui , Bo Yuan , and Saman Zonouz . 2023 . CSTAR: Towards Compact and STructured Deep Neural Networks with Adversarial Robustness. AAAI (2023). Huy Phan, Miao Yin, Yang Sui, Bo Yuan, and Saman Zonouz. 2023. CSTAR: Towards Compact and STructured Deep Neural Networks with Adversarial Robustness. AAAI (2023)."},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA47549.2020.00015"},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46493-0_32"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/3093336.3037746"},{"key":"e_1_3_2_1_39_1","volume-title":"Exploring extreme parameter compression for pre-trained language models. 
arXiv preprint arXiv:2205.10036","author":"Ren Yuxin","year":"2022","unstructured":"Yuxin Ren , Benyou Wang , Lifeng Shang , Xin Jiang , and Qun Liu . 2022. Exploring extreme parameter compression for pre-trained language models. arXiv preprint arXiv:2205.10036 ( 2022 ). Yuxin Ren, Benyou Wang, Lifeng Shang, Xin Jiang, and Qun Liu. 2022. Exploring extreme parameter compression for pre-trained language models. arXiv preprint arXiv:2205.10036 (2022)."},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2018.00017"},{"key":"e_1_3_2_1_41_1","volume-title":"Griffin: Rethinking Sparse Optimization for Deep Learning Architectures. In 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA). IEEE, 861--875","author":"Shin Jong Hoon","year":"2022","unstructured":"Jong Hoon Shin , Ali Shafiee , Ardavan Pedram , Hamzah Abdel-Aziz , Ling Li , and Joseph Hassoun . 2022 . Griffin: Rethinking Sparse Optimization for Deep Learning Architectures. In 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA). IEEE, 861--875 . Jong Hoon Shin, Ali Shafiee, Ardavan Pedram, Hamzah Abdel-Aziz, Ling Li, and Joseph Hassoun. 2022. Griffin: Rethinking Sparse Optimization for Deep Learning Architectures. In 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA). IEEE, 861--875."},{"key":"e_1_3_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2017.55"},{"key":"e_1_3_2_1_43_1","volume-title":"2018 IEEE International Symposium on High Performance Computer Architecture (HPCA). IEEE, 66--77","author":"Song Mingcong","year":"2018","unstructured":"Mingcong Song , Jiaqi Zhang , Huixiang Chen , and Tao Li . 2018 . Towards efficient microarchitectural design for accelerating unsupervised gan-based deep learning . In 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA). IEEE, 66--77 . Mingcong Song, Jiaqi Zhang, Huixiang Chen, and Tao Li. 2018. Towards efficient microarchitectural design for accelerating unsupervised gan-based deep learning. In 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA). IEEE, 66--77."},{"key":"e_1_3_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2018.00018"},{"key":"e_1_3_2_1_45_1","first-page":"24604","article-title":"Chip: Channel independence-based pruning for compact neural networks","volume":"34","author":"Sui Yang","year":"2021","unstructured":"Yang Sui , Miao Yin , Yi Xie , Huy Phan , Saman Aliari Zonouz , and Bo Yuan . 2021 . Chip: Channel independence-based pruning for compact neural networks . Advances in Neural Information Processing Systems 34 (2021), 24604 -- 24616 . Yang Sui, Miao Yin, Yi Xie, Huy Phan, Saman Aliari Zonouz, and Bo Yuan. 2021. Chip: Channel independence-based pruning for compact neural networks. Advances in Neural Information Processing Systems 34 (2021), 24604--24616.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_1_46_1","volume-title":"Proceedings of the 44th Annual International Symposium on Computer Architecture. 13--26","author":"Venkataramani Swagath","year":"2017","unstructured":"Swagath Venkataramani , Ashish Ranjan , Subarno Banerjee , Dipankar Das , Sasikanth Avancha , Ashok Jagannathan , Ajaya Durg , Dheemanth Nagaraj , Bharat Kaul , Pradeep Dubey , 2017 . Scaledeep: A scalable compute architecture for learning and evaluating deep networks . In Proceedings of the 44th Annual International Symposium on Computer Architecture. 13--26 . 
Swagath Venkataramani, Ashish Ranjan, Subarno Banerjee, Dipankar Das, Sasikanth Avancha, Ashok Jagannathan, Ajaya Durg, Dheemanth Nagaraj, Bharat Kaul, Pradeep Dubey, et al. 2017. Scaledeep: A scalable compute architecture for learning and evaluating deep networks. In Proceedings of the 44th Annual International Symposium on Computer Architecture. 13--26."},{"key":"e_1_3_2_1_47_1","volume-title":"2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA). IEEE, 97--110","author":"Wang Hanrui","year":"2021","unstructured":"Hanrui Wang , Zhekai Zhang , and Song Han . 2021 . Spatten: Efficient sparse attention architecture with cascade token and head pruning . In 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA). IEEE, 97--110 . Hanrui Wang, Zhekai Zhang, and Song Han. 2021. Spatten: Efficient sparse attention architecture with cascade token and head pruning. In 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA). IEEE, 97--110."},{"key":"e_1_3_2_1_48_1","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 9329--9338","author":"Wang Wenqi","year":"2018","unstructured":"Wenqi Wang , Yifan Sun , Brian Eriksson , Wenlin Wang , and Vaneet Aggarwal . 2018 . Wide compression: Tensor ring nets . In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 9329--9338 . Wenqi Wang, Yifan Sun, Brian Eriksson, Wenlin Wang, and Vaneet Aggarwal. 2018. Wide compression: Tensor ring nets. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 9329--9338."},{"key":"e_1_3_2_1_49_1","volume-title":"Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming. 260--273","author":"Xiang Lizhi","year":"2023","unstructured":"Lizhi Xiang , Miao Yin , Chengming Zhang , Aravind Sukumaran-Rajam , P Sadayappan , Bo Yuan , and Dingwen Tao . 2023 . TDC: Towards Extremely Efficient CNNs on GPUs via Hardware-Aware Tucker Decomposition . In Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming. 260--273 . Lizhi Xiang, Miao Yin, Chengming Zhang, Aravind Sukumaran-Rajam, P Sadayappan, Bo Yuan, and Dingwen Tao. 2023. TDC: Towards Extremely Efficient CNNs on GPUs via Hardware-Aware Tucker Decomposition. In Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming. 260--273."},{"key":"e_1_3_2_1_50_1","volume-title":"d.]","author":"Xiao Jinqi","unstructured":"Jinqi Xiao , Chengming Zhang , Yu Gong , Miao Yin , Yang Sui , Lizhi Xiang , Dingwen Tao , and Bo Yuan . [n. d.] . HALOC : Hardware-Aware Automatic Low-Rank Compression for Compact Neural Networks. AAAI ( [n. d.]). Jinqi Xiao, Chengming Zhang, Yu Gong, Miao Yin, Yang Sui, Lizhi Xiang, Dingwen Tao, and Bo Yuan. [n. d.]. HALOC: Hardware-Aware Automatic Low-Rank Compression for Compact Neural Networks. AAAI ([n. d.])."},{"key":"e_1_3_2_1_51_1","volume-title":"Tensor-Train Recurrent Neural Networks for Video Classification. In International Conference on Machine Learning. 3891--3900","author":"Yang Yinchong","year":"2017","unstructured":"Yinchong Yang , Denis Krompass , and Volker Tresp . 2017 . Tensor-Train Recurrent Neural Networks for Video Classification. In International Conference on Machine Learning. 3891--3900 . Yinchong Yang, Denis Krompass, and Volker Tresp. 2017. Tensor-Train Recurrent Neural Networks for Video Classification. 
In International Conference on Machine Learning. 3891--3900."},{"key":"e_1_3_2_1_52_1","volume-title":"Sparse Attention Acceleration with Synergistic In-Memory Pruning and On-Chip Recomputation. In 2022 55th IEEE\/ACM International Symposium on Microarchitecture (MICRO). IEEE, 744--762","author":"Yazdanbakhsh Amir","year":"2022","unstructured":"Amir Yazdanbakhsh , Ashkan Moradifirouzabadi , Zheng Li , and Mingu Kang . 2022 . Sparse Attention Acceleration with Synergistic In-Memory Pruning and On-Chip Recomputation. In 2022 55th IEEE\/ACM International Symposium on Microarchitecture (MICRO). IEEE, 744--762 . Amir Yazdanbakhsh, Ashkan Moradifirouzabadi, Zheng Li, and Mingu Kang. 2022. Sparse Attention Acceleration with Synergistic In-Memory Pruning and On-Chip Recomputation. In 2022 55th IEEE\/ACM International Symposium on Microarchitecture (MICRO). IEEE, 744--762."},{"key":"e_1_3_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00977"},{"key":"e_1_3_2_1_54_1","first-page":"448","article-title":"Tt-rec: Tensor train compression for deep learning recommendation models","volume":"3","author":"Yin Chunxing","year":"2021","unstructured":"Chunxing Yin , Bilge Acun , Carole-Jean Wu , and Xing Liu . 2021 . Tt-rec: Tensor train compression for deep learning recommendation models . Proceedings of Machine Learning and Systems 3 (2021), 448 -- 462 . Chunxing Yin, Bilge Acun, Carole-Jean Wu, and Xing Liu. 2021. Tt-rec: Tensor train compression for deep learning recommendation models. Proceedings of Machine Learning and Systems 3 (2021), 448--462.","journal-title":"Proceedings of Machine Learning and Systems"},{"key":"e_1_3_2_1_55_1","volume-title":"Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2327--2335","author":"Yin Chunxing","year":"2022","unstructured":"Chunxing Yin , Da Zheng , Israt Nisa , Christos Faloutsos , George Karypis , and Richard Vuduc . 2022 . Nimble GNN Embedding with Tensor-Train Decomposition . In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2327--2335 . Chunxing Yin, Da Zheng, Israt Nisa, Christos Faloutsos, George Karypis, and Richard Vuduc. 2022. Nimble GNN Embedding with Tensor-Train Decomposition. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2327--2335."},{"key":"e_1_3_2_1_56_1","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","volume":"36","author":"Yin Miao","year":"2022","unstructured":"Miao Yin , Huy Phan , Xiao Zang , Siyu Liao , and Bo Yuan . 2022 . Batude: Budgetaware neural network compression based on tucker decomposition . In Proceedings of the AAAI Conference on Artificial Intelligence , Vol. 36 . 8874--8882. Miao Yin, Huy Phan, Xiao Zang, Siyu Liao, and Bo Yuan. 2022. Batude: Budgetaware neural network compression based on tucker decomposition. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 36. 8874--8882."},{"key":"e_1_3_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.01053"},{"key":"e_1_3_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01198"},{"key":"e_1_3_2_1_59_1","volume-title":"GOHSP: A Unified Framework of Graph and Optimization-based Heterogeneous Structured Pruning for Vision Transformer. AAAI","author":"Yin Miao","year":"2023","unstructured":"Miao Yin , Burak Uzkent , Yilin Shen , Hongxia Jin , and Bo Yuan . 2023 . 
GOHSP: A Unified Framework of Graph and Optimization-based Heterogeneous Structured Pruning for Vision Transformer. AAAI (2023). Miao Yin, Burak Uzkent, Yilin Shen, Hongxia Jin, and Bo Yuan. 2023. GOHSP: A Unified Framework of Graph and Optimization-based Heterogeneous Structured Pruning for Vision Transformer. AAAI (2023)."},{"key":"e_1_3_2_1_60_1","volume-title":"Mokey: enabling narrow fixed-point inference for out-of-the-box floatingpoint transformer models. arXiv preprint arXiv:2203.12758","author":"Zadeh Ali Hadi","year":"2022","unstructured":"Ali Hadi Zadeh , Mostafa Mahmoud , Ameer Abdelhadi , and Andreas Moshovos . 2022. Mokey: enabling narrow fixed-point inference for out-of-the-box floatingpoint transformer models. arXiv preprint arXiv:2203.12758 ( 2022 ). Ali Hadi Zadeh, Mostafa Mahmoud, Ameer Abdelhadi, and Andreas Moshovos. 2022. Mokey: enabling narrow fixed-point inference for out-of-the-box floatingpoint transformer models. arXiv preprint arXiv:2203.12758 (2022)."},{"key":"e_1_3_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2016.7783723"},{"key":"e_1_3_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10772-018-09573-7"}],"event":{"name":"ISCA '23: 50th Annual International Symposium on Computer Architecture","location":"Orlando FL USA","acronym":"ISCA '23","sponsor":["SIGARCH ACM Special Interest Group on Computer Architecture","IEEE"]},"container-title":["Proceedings of the 50th Annual International Symposium on Computer Architecture"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3579371.3589103","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T16:46:40Z","timestamp":1750178800000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3579371.3589103"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6,17]]},"references-count":62,"alternative-id":["10.1145\/3579371.3589103","10.1145\/3579371"],"URL":"https:\/\/doi.org\/10.1145\/3579371.3589103","relation":{},"subject":[],"published":{"date-parts":[[2023,6,17]]},"assertion":[{"value":"2023-06-17","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}