{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,8]],"date-time":"2026-05-08T16:38:25Z","timestamp":1778258305845,"version":"3.51.4"},"publisher-location":"New York, NY, USA","reference-count":56,"publisher":"ACM","license":[{"start":{"date-parts":[[2023,10,21]],"date-time":"2023-10-21T00:00:00Z","timestamp":1697846400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2023,10,21]]},"DOI":"10.1145\/3583780.3614904","type":"proceedings-article","created":{"date-parts":[[2023,10,21]],"date-time":"2023-10-21T07:45:26Z","timestamp":1697874326000},"page":"276-285","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":11,"title":["Hadamard Adapter: An Extreme Parameter-Efficient Adapter Tuning Method for Pre-trained Language Models"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4381-486X","authenticated-orcid":false,"given":"Yuyan","family":"Chen","sequence":"first","affiliation":[{"name":"Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5821-7267","authenticated-orcid":false,"given":"Qiang","family":"Fu","sequence":"additional","affiliation":[{"name":"Microsoft, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5653-1626","authenticated-orcid":false,"given":"Ge","family":"Fan","sequence":"additional","affiliation":[{"name":"Tencent, Shenzhen, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7625-0650","authenticated-orcid":false,"given":"Lun","family":"Du","sequence":"additional","affiliation":[{"name":"Microsoft, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8496-033X","authenticated-orcid":false,"given":"Jian-Guang","family":"Lou","sequence":"additional","affiliation":[{"name":"Microsoft, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0360-6089","authenticated-orcid":false,"given":"Shi","family":"Han","sequence":"additional","affiliation":[{"name":"Microsoft, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9230-2799","authenticated-orcid":false,"given":"Dongmei","family":"Zhang","sequence":"additional","affiliation":[{"name":"Microsoft, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2355-288X","authenticated-orcid":false,"given":"Zhixu","family":"Li","sequence":"additional","affiliation":[{"name":"Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8403-9591","authenticated-orcid":false,"given":"Yanghua","family":"Xiao","sequence":"additional","affiliation":[{"name":"Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University & Fudan-Aishu Cognitive Intelligence Joint Research Center, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2023,10,21]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"Anna Korhonen, and Ivan Vuli\u0107.","author":"Ansell Alan","year":"2021","unstructured":"Alan Ansell , Edoardo Maria Ponti , Anna Korhonen, and Ivan Vuli\u0107. 2021 . 
Composable Sparse Fine-Tuning for Cross-Lingual Transfer . https:\/\/doi.org\/10.48550\/ARXIV.2110.07560 10.48550\/ARXIV.2110.07560 Alan Ansell, Edoardo Maria Ponti, Anna Korhonen, and Ivan Vuli\u0107. 2021. Composable Sparse Fine-Tuning for Cross-Lingual Transfer. https:\/\/doi.org\/10.48550\/ARXIV.2110.07560"},{"key":"#cr-split#-e_1_3_2_1_2_1.1","unstructured":"Lei Jimmy Ba and Rich Caruana. 2013. Do Deep Nets Really Need to be Deep? https:\/\/doi.org\/10.48550\/ARXIV.1312.6184 10.48550\/ARXIV.1312.6184"},{"key":"#cr-split#-e_1_3_2_1_2_1.2","unstructured":"Lei Jimmy Ba and Rich Caruana. 2013. Do Deep Nets Really Need to be Deep? https:\/\/doi.org\/10.48550\/ARXIV.1312.6184"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.acl-short.1"},{"key":"#cr-split#-e_1_3_2_1_4_1.1","unstructured":"Tom B. Brown Benjamin Mann Nick Ryder Melanie Subbiah Jared Kaplan Prafulla Dhariwal Arvind Neelakantan Pranav Shyam Girish Sastry Amanda Askell Sandhini Agarwal Ariel Herbert-Voss Gretchen Krueger Tom Henighan Rewon Child Aditya Ramesh Daniel M. Ziegler Jeffrey Wu Clemens Winter Christopher Hesse Mark Chen Eric Sigler Mateusz Litwin Scott Gray Benjamin Chess Jack Clark Christopher Berner Sam McCandlish Alec Radford Ilya Sutskever and Dario Amodei. 2020. Language Models are Few-Shot Learners. https:\/\/doi.org\/10.48550\/ARXIV.2005.14165 10.48550\/ARXIV.2005.14165"},{"key":"#cr-split#-e_1_3_2_1_4_1.2","unstructured":"Tom B. Brown Benjamin Mann Nick Ryder Melanie Subbiah Jared Kaplan Prafulla Dhariwal Arvind Neelakantan Pranav Shyam Girish Sastry Amanda Askell Sandhini Agarwal Ariel Herbert-Voss Gretchen Krueger Tom Henighan Rewon Child Aditya Ramesh Daniel M. Ziegler Jeffrey Wu Clemens Winter Christopher Hesse Mark Chen Eric Sigler Mateusz Litwin Scott Gray Benjamin Chess Jack Clark Christopher Berner Sam McCandlish Alec Radford Ilya Sutskever and Dario Amodei. 2020. Language Models are Few-Shot Learners. 
https:\/\/doi.org\/10.48550\/ARXIV.2005.14165"},{"key":"#cr-split#-e_1_3_2_1_5_1.1","unstructured":"Yu Cheng Duo Wang Pan Zhou and Tao Zhang. 2017. A Survey of Model Compression and Acceleration for Deep Neural Networks. https:\/\/doi.org\/10.48550\/ARXIV.1710.09282 10.48550\/ARXIV.1710.09282"},{"key":"#cr-split#-e_1_3_2_1_5_1.2","unstructured":"Yu Cheng Duo Wang Pan Zhou and Tao Zhang. 2017. A Survey of Model Compression and Acceleration for Deep Neural Networks. https:\/\/doi.org\/10.48550\/ARXIV.1710.09282"},{"key":"e_1_3_2_1_6_1","volume-title":"Manning","author":"Clark Kevin","year":"2020","unstructured":"Kevin Clark , Minh-Thang Luong , Quoc V. Le , and Christopher D . Manning . 2020 . ELECTRA : Pre-training Text Encoders as Discriminators Rather Than Generators . https:\/\/doi.org\/10.48550\/ARXIV.2003.10555 10.48550\/ARXIV.2003.10555 Kevin Clark, Minh-Thang Luong, Quoc V. Le, and Christopher D. Manning. 2020. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. https:\/\/doi.org\/10.48550\/ARXIV.2003.10555"},{"key":"e_1_3_2_1_7_1","volume-title":"Large Scale Distributed Deep Networks. Advances in neural information processing systems (10","author":"Dean Jeffrey","year":"2012","unstructured":"Jeffrey Dean , G.s Corrado, Rajat Monga , Kai Chen , Matthieu Devin , Quoc Le , Mark Mao , Aurelio Ranzato , Andrew Senior , Paul Tucker , Ke Yang , and Andrew Ng. 2012. Large Scale Distributed Deep Networks. Advances in neural information processing systems (10 2012 ). Jeffrey Dean, G.s Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Quoc Le, Mark Mao, Aurelio Ranzato, Andrew Senior, Paul Tucker, Ke Yang, and Andrew Ng. 2012. Large Scale Distributed Deep Networks. Advances in neural information processing systems (10 2012)."},{"key":"e_1_3_2_1_8_1","volume-title":"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. 
https:\/\/doi.org\/10.48550\/ARXIV.1810.04805","author":"Devlin Jacob","year":"2018","unstructured":"Jacob Devlin , Ming-Wei Chang , Kenton Lee , and Kristina Toutanova . 2018 . BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. https:\/\/doi.org\/10.48550\/ARXIV.1810.04805 10.48550\/ARXIV.1810.04805 Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. https:\/\/doi.org\/10.48550\/ARXIV.1810.04805"},{"key":"#cr-split#-e_1_3_2_1_9_1.1","unstructured":"Yunchao Gong Liu Liu Ming Yang and Lubomir Bourdev. 2014. Compressing Deep Convolutional Networks using Vector Quantization. https:\/\/doi.org\/10.48550\/ARXIV.1412.6115 10.48550\/ARXIV.1412.6115"},{"key":"#cr-split#-e_1_3_2_1_9_1.2","unstructured":"Yunchao Gong Liu Liu Ming Yang and Lubomir Bourdev. 2014. Compressing Deep Convolutional Networks using Vector Quantization. https:\/\/doi.org\/10.48550\/ARXIV.1412.6115"},{"key":"e_1_3_2_1_10_1","volume-title":"PPT: Pre-trained Prompt Tuning for Few-shot Learning. https:\/\/doi.org\/10.48550\/ARXIV.2109.04332","author":"Gu Yuxian","year":"2021","unstructured":"Yuxian Gu , Xu Han , Zhiyuan Liu , and Minlie Huang . 2021 . PPT: Pre-trained Prompt Tuning for Few-shot Learning. https:\/\/doi.org\/10.48550\/ARXIV.2109.04332 10.48550\/ARXIV.2109.04332 Yuxian Gu, Xu Han, Zhiyuan Liu, and Minlie Huang. 2021. PPT: Pre-trained Prompt Tuning for Few-shot Learning. https:\/\/doi.org\/10.48550\/ARXIV.2109.04332"},{"key":"e_1_3_2_1_11_1","volume-title":"Dally","author":"Han Song","year":"2015","unstructured":"Song Han , Jeff Pool , John Tran , and William J . Dally . 2015 . Learning both Weights and Connections for Efficient Neural Networks . https:\/\/doi.org\/10.48550\/ARXIV.1506.02626 10.48550\/ARXIV.1506.02626 Song Han, Jeff Pool, John Tran, and William J. Dally. 2015. Learning both Weights and Connections for Efficient Neural Networks. 
https:\/\/doi.org\/10.48550\/ARXIV.1506.02626"},{"key":"e_1_3_2_1_12_1","volume-title":"Towards a unified view of parameter-efficient transfer learning. arXiv preprint arXiv:2110.04366","author":"He Junxian","year":"2021","unstructured":"Junxian He , Chunting Zhou , Xuezhe Ma , Taylor Berg-Kirkpatrick , and Graham Neubig . 2021. Towards a unified view of parameter-efficient transfer learning. arXiv preprint arXiv:2110.04366 ( 2021 ). Junxian He, Chunting Zhou, Xuezhe Ma, Taylor Berg-Kirkpatrick, and Graham Neubig. 2021. Towards a unified view of parameter-efficient transfer learning. arXiv preprint arXiv:2110.04366 (2021)."},{"key":"#cr-split#-e_1_3_2_1_13_1.1","unstructured":"Pengcheng He Xiaodong Liu Jianfeng Gao and Weizhu Chen. 2020. DeBERTa: Decoding-enhanced BERT with Disentangled Attention. https:\/\/doi.org\/10.48550\/ARXIV.2006.03654 10.48550\/ARXIV.2006.03654"},{"key":"#cr-split#-e_1_3_2_1_13_1.2","unstructured":"Pengcheng He Xiaodong Liu Jianfeng Gao and Weizhu Chen. 2020. DeBERTa: Decoding-enhanced BERT with Disentangled Attention. https:\/\/doi.org\/10.48550\/ARXIV.2006.03654"},{"key":"#cr-split#-e_1_3_2_1_14_1.1","unstructured":"Geoffrey Hinton Oriol Vinyals and Jeff Dean. 2015. Distilling the Knowledge in a Neural Network. https:\/\/doi.org\/10.48550\/ARXIV.1503.02531 10.48550\/ARXIV.1503.02531"},{"key":"#cr-split#-e_1_3_2_1_14_1.2","unstructured":"Geoffrey Hinton Oriol Vinyals and Jeff Dean. 2015. Distilling the Knowledge in a Neural Network. https:\/\/doi.org\/10.48550\/ARXIV.1503.02531"},{"key":"e_1_3_2_1_15_1","volume-title":"Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9--15","volume":"2799","author":"Houlsby Neil","year":"2019","unstructured":"Neil Houlsby , Andrei Giurgiu , Stanislaw Jastrzebski , Bruna Morrone , Quentin de Laroussilhe , Andrea Gesmundo , Mona Attariyan , and Sylvain Gelly . 2019 . Parameter-Efficient Transfer Learning for NLP . 
In Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9--15 June 2019, Long Beach, California, USA (Proceedings of Machine Learning Research , Vol. 97), Kamalika Chaudhuri and Ruslan Salakhutdinov (Eds.). PMLR, 2790-- 2799 . http:\/\/proceedings.mlr.press\/v97\/houlsby19a.html Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski, Bruna Morrone, Quentin de Laroussilhe, Andrea Gesmundo, Mona Attariyan, and Sylvain Gelly. 2019. Parameter-Efficient Transfer Learning for NLP. In Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9--15 June 2019, Long Beach, California, USA (Proceedings of Machine Learning Research, Vol. 97), Kamalika Chaudhuri and Ruslan Salakhutdinov (Eds.). PMLR, 2790--2799. http:\/\/proceedings.mlr.press\/v97\/houlsby19a.html"},{"key":"#cr-split#-e_1_3_2_1_16_1.1","unstructured":"Edward J. Hu Yelong Shen Phillip Wallis Zeyuan Allen-Zhu Yuanzhi Li Shean Wang Lu Wang and Weizhu Chen. 2021. LoRA: Low-Rank Adaptation of Large Language Models. https:\/\/doi.org\/10.48550\/ARXIV.2106.09685 10.48550\/ARXIV.2106.09685"},{"key":"#cr-split#-e_1_3_2_1_16_1.2","unstructured":"Edward J. Hu Yelong Shen Phillip Wallis Zeyuan Allen-Zhu Yuanzhi Li Shean Wang Lu Wang and Weizhu Chen. 2021. LoRA: Low-Rank Adaptation of Large Language Models. https:\/\/doi.org\/10.48550\/ARXIV.2106.09685"},{"key":"e_1_3_2_1_17_1","volume-title":"Fine-tuning can distort pretrained features and underperform out-of-distribution. arXiv preprint arXiv:2202.10054","author":"Kumar Ananya","year":"2022","unstructured":"Ananya Kumar , Aditi Raghunathan , Robbie Jones , Tengyu Ma , and Percy Liang . 2022. Fine-tuning can distort pretrained features and underperform out-of-distribution. arXiv preprint arXiv:2202.10054 ( 2022 ). Ananya Kumar, Aditi Raghunathan, Robbie Jones, Tengyu Ma, and Percy Liang. 2022. Fine-tuning can distort pretrained features and underperform out-of-distribution. 
arXiv preprint arXiv:2202.10054 (2022)."},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.emnlp-main.243"},{"key":"e_1_3_2_1_19_1","volume-title":"BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. https:\/\/doi.org\/10.48550\/ARXIV.1910.13461","author":"Lewis Mike","year":"2019","unstructured":"Mike Lewis , Yinhan Liu , Naman Goyal , Marjan Ghazvininejad , Abdelrahman Mohamed , Omer Levy , Ves Stoyanov , and Luke Zettlemoyer . 2019 . BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. https:\/\/doi.org\/10.48550\/ARXIV.1910.13461 10.48550\/ARXIV.1910.13461 Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov, and Luke Zettlemoyer. 2019. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. https:\/\/doi.org\/10.48550\/ARXIV.1910.13461"},{"key":"e_1_3_2_1_20_1","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"Li Xiang Lisa","year":"2021","unstructured":"Xiang Lisa Li and Percy Liang . 2021. Prefix-Tuning: Optimizing Continuous Prompts for Generation . In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) . Association for Computational Linguistics , Online , 4582--4597. https:\/\/doi.org\/10.18653\/v1\/2021.acl-long.353 10.18653\/v1 Xiang Lisa Li and Percy Liang. 2021. Prefix-Tuning: Optimizing Continuous Prompts for Generation. 
In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Association for Computational Linguistics, Online, 4582--4597. https:\/\/doi.org\/10.18653\/v1\/2021.acl-long.353"},{"key":"e_1_3_2_1_21_1","first-page":"441","volume-title":"Exploring Versatile Generative Language Model Via Parameter-Efficient Transfer Learning. In Findings of the Association for Computational Linguistics: EMNLP 2020","author":"Lin Zhaojiang","year":"2020","unstructured":"Zhaojiang Lin , Andrea Madotto , and Pascale Fung . 2020 . Exploring Versatile Generative Language Model Via Parameter-Efficient Transfer Learning. In Findings of the Association for Computational Linguistics: EMNLP 2020 . Association for Computational Linguistics, Online, 441--459. https:\/\/doi.org\/10.18653\/v1\/2020.findings-emnlp.41 10.18653\/v1 Zhaojiang Lin, Andrea Madotto, and Pascale Fung. 2020. Exploring Versatile Generative Language Model Via Parameter-Efficient Transfer Learning. In Findings of the Association for Computational Linguistics: EMNLP 2020. Association for Computational Linguistics, Online, 441--459. https:\/\/doi.org\/10.18653\/v1\/2020.findings-emnlp.41"},{"key":"e_1_3_2_1_22_1","first-page":"1950","article-title":"Few-shot parameter-efficient fine-tuning is better and cheaper than in-context learning","volume":"35","author":"Liu Haokun","year":"2022","unstructured":"Haokun Liu , Derek Tam , Mohammed Muqeeth , Jay Mohta , Tenghao Huang , Mohit Bansal , and Colin A Raffel . 2022 b. Few-shot parameter-efficient fine-tuning is better and cheaper than in-context learning . Advances in Neural Information Processing Systems , Vol. 35 (2022), 1950 -- 1965 . Haokun Liu, Derek Tam, Mohammed Muqeeth, Jay Mohta, Tenghao Huang, Mohit Bansal, and Colin A Raffel. 2022b. Few-shot parameter-efficient fine-tuning is better and cheaper than in-context learning. 
Advances in Neural Information Processing Systems, Vol. 35 (2022), 1950--1965.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_1_23_1","volume-title":"Zhi-Yuan Xie, Zhong-Yi Lu, and Ji-Rong Wen.","author":"Liu Peiyu","year":"2021","unstructured":"Peiyu Liu , Ze-Feng Gao , Wayne Xin Zhao , Zhi-Yuan Xie, Zhong-Yi Lu, and Ji-Rong Wen. 2021 a. Enabling Lightweight Fine-tuning for Pre-trained Language Model Compression based on Matrix Product Operators. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Association for Computational Linguistics, Online, 5388--5398. https:\/\/doi.org\/10.18653\/v1\/2021.acl-long.418 10.18653\/v1 Peiyu Liu, Ze-Feng Gao, Wayne Xin Zhao, Zhi-Yuan Xie, Zhong-Yi Lu, and Ji-Rong Wen. 2021a. Enabling Lightweight Fine-tuning for Pre-trained Language Model Compression based on Matrix Product Operators. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Association for Computational Linguistics, Online, 5388--5398. https:\/\/doi.org\/10.18653\/v1\/2021.acl-long.418"},{"key":"e_1_3_2_1_24_1","unstructured":"Xiao Liu Yanan Zheng Zhengxiao Du Ming Ding Yujie Qian Zhilin Yang and Jie Tang. 2021b. GPT Understands Too. https:\/\/doi.org\/10.48550\/ARXIV.2103.10385    10.48550\/ARXIV.2103.10385\nXiao Liu Yanan Zheng Zhengxiao Du Ming Ding Yujie Qian Zhilin Yang and Jie Tang. 2021b. GPT Understands Too. https:\/\/doi.org\/10.48550\/ARXIV.2103.10385"},{"key":"e_1_3_2_1_25_1","unstructured":"Yitao Liu Chenxin An and Xipeng Qiu. 2022a. $\\mathcal{Y}$-Tuning: An Efficient Tuning Paradigm for Large-Scale Pre-Trained Models via Label Representation Learning. 
https:\/\/doi.org\/10.48550\/ARXIV.2202.09817    10.48550\/ARXIV.2202.09817\nYitao Liu Chenxin An and Xipeng Qiu. 2022a. $\\mathcal{Y}$-Tuning: An Efficient Tuning Paradigm for Large-Scale Pre-Trained Models via Label Representation Learning. https:\/\/doi.org\/10.48550\/ARXIV.2202.09817"},{"key":"#cr-split#-e_1_3_2_1_26_1.1","unstructured":"Yinhan Liu Myle Ott Naman Goyal Jingfei Du Mandar Joshi Danqi Chen Omer Levy Mike Lewis Luke Zettlemoyer and Veselin Stoyanov. 2019. RoBERTa: A Robustly Optimized BERT Pretraining Approach. https:\/\/doi.org\/10.48550\/ARXIV.1907.11692 10.48550\/ARXIV.1907.11692"},{"key":"#cr-split#-e_1_3_2_1_26_1.2","unstructured":"Yinhan Liu Myle Ott Naman Goyal Jingfei Du Mandar Joshi Danqi Chen Omer Levy Mike Lewis Luke Zettlemoyer and Veselin Stoyanov. 2019. RoBERTa: A Robustly Optimized BERT Pretraining Approach. https:\/\/doi.org\/10.48550\/ARXIV.1907.11692"},{"key":"#cr-split#-e_1_3_2_1_27_1.1","unstructured":"Rabeeh Karimi Mahabadi Sebastian Ruder Mostafa Dehghani and James Henderson. 2021. Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks. https:\/\/doi.org\/10.48550\/ARXIV.2106.04489 10.48550\/ARXIV.2106.04489"},{"key":"#cr-split#-e_1_3_2_1_27_1.2","doi-asserted-by":"crossref","unstructured":"Rabeeh Karimi Mahabadi Sebastian Ruder Mostafa Dehghani and James Henderson. 2021. Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks. https:\/\/doi.org\/10.48550\/ARXIV.2106.04489","DOI":"10.18653\/v1\/2021.acl-long.47"},{"key":"e_1_3_2_1_28_1","volume-title":"Unipelt: A unified framework for parameter-efficient language model tuning. arXiv preprint arXiv:2110.07577","author":"Mao Yuning","year":"2021","unstructured":"Yuning Mao , Lambert Mathias , Rui Hou , Amjad Almahairi , Hao Ma , Jiawei Han , Wen-tau Yih, and Madian Khabsa . 2021 . Unipelt: A unified framework for parameter-efficient language model tuning. arXiv preprint arXiv:2110.07577 (2021). 
Yuning Mao, Lambert Mathias, Rui Hou, Amjad Almahairi, Hao Ma, Jiawei Han, Wen-tau Yih, and Madian Khabsa. 2021. Unipelt: A unified framework for parameter-efficient language model tuning. arXiv preprint arXiv:2110.07577 (2021)."},{"key":"#cr-split#-e_1_3_2_1_29_1.1","unstructured":"Jonas Pfeiffer Aishwarya Kamath Andreas R\u00fcckl\u00e9 Kyunghyun Cho and Iryna Gurevych. 2020. AdapterFusion: Non-Destructive Task Composition for Transfer Learning. (2020). https:\/\/doi.org\/10.48550\/ARXIV.2005.00247 10.48550\/ARXIV.2005.00247"},{"key":"#cr-split#-e_1_3_2_1_29_1.2","doi-asserted-by":"crossref","unstructured":"Jonas Pfeiffer Aishwarya Kamath Andreas R\u00fcckl\u00e9 Kyunghyun Cho and Iryna Gurevych. 2020. AdapterFusion: Non-Destructive Task Composition for Transfer Learning. (2020). https:\/\/doi.org\/10.48550\/ARXIV.2005.00247","DOI":"10.18653\/v1\/2021.eacl-main.39"},{"key":"e_1_3_2_1_30_1","volume-title":"Parameter-Efficient Tuning on Layer Normalization for Pre-trained Language Models. arXiv preprint arXiv:2211.08682","author":"Qi Wang","year":"2022","unstructured":"Wang Qi , Yu-Ping Ruan , Yuan Zuo , and Taihao Li. 2022. Parameter-Efficient Tuning on Layer Normalization for Pre-trained Language Models. arXiv preprint arXiv:2211.08682 ( 2022 ). Wang Qi, Yu-Ping Ruan, Yuan Zuo, and Taihao Li. 2022. Parameter-Efficient Tuning on Layer Normalization for Pre-trained Language Models. arXiv preprint arXiv:2211.08682 (2022)."},{"key":"e_1_3_2_1_31_1","volume-title":"Liu","author":"Raffel Colin","year":"2019","unstructured":"Colin Raffel , Noam Shazeer , Adam Roberts , Katherine Lee , Sharan Narang , Michael Matena , Yanqi Zhou , Wei Li , and Peter J . Liu . 2019 . Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer . https:\/\/doi.org\/10.48550\/ARXIV.1910.10683 10.48550\/ARXIV.1910.10683 Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. 2019. 
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. https:\/\/doi.org\/10.48550\/ARXIV.1910.10683"},{"key":"e_1_3_2_1_32_1","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Online and Punta Cana, Dominican Republic, 7930--7946","author":"R\u00fcckl\u00e9 Andreas","year":"2021","unstructured":"Andreas R\u00fcckl\u00e9 , Gregor Geigle , Max Glockner , Tilman Beck , Jonas Pfeiffer , Nils Reimers , and Iryna Gurevych . 2021 . AdapterDrop: On the Efficiency of Adapters in Transformers . In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Online and Punta Cana, Dominican Republic, 7930--7946 . https:\/\/doi.org\/10.18653\/v1\/2021.emnlp-main.626 10.18653\/v1 Andreas R\u00fcckl\u00e9, Gregor Geigle, Max Glockner, Tilman Beck, Jonas Pfeiffer, Nils Reimers, and Iryna Gurevych. 2021. AdapterDrop: On the Efficiency of Adapters in Transformers. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Online and Punta Cana, Dominican Republic, 7930--7946. https:\/\/doi.org\/10.18653\/v1\/2021.emnlp-main.626"},{"key":"#cr-split#-e_1_3_2_1_33_1.1","doi-asserted-by":"crossref","unstructured":"Suraj Srinivas and R. Venkatesh Babu. 2015. Data-free parameter pruning for Deep Neural Networks. https:\/\/doi.org\/10.48550\/ARXIV.1507.06149 10.48550\/ARXIV.1507.06149","DOI":"10.5244\/C.29.31"},{"key":"#cr-split#-e_1_3_2_1_33_1.2","doi-asserted-by":"crossref","unstructured":"Suraj Srinivas and R. Venkatesh Babu. 2015. Data-free parameter pruning for Deep Neural Networks. https:\/\/doi.org\/10.48550\/ARXIV.1507.06149","DOI":"10.5244\/C.29.31"},{"key":"#cr-split#-e_1_3_2_1_34_1.1","unstructured":"Tianxiang Sun Yunfan Shao Hong Qian Xuanjing Huang and Xipeng Qiu. 2022. 
Black-Box Tuning for Language-Model-as-a-Service. https:\/\/doi.org\/10.48550\/ARXIV.2201.03514 10.48550\/ARXIV.2201.03514"},{"key":"#cr-split#-e_1_3_2_1_34_1.2","unstructured":"Tianxiang Sun Yunfan Shao Hong Qian Xuanjing Huang and Xipeng Qiu. 2022. Black-Box Tuning for Language-Model-as-a-Service. https:\/\/doi.org\/10.48550\/ARXIV.2201.03514"},{"key":"#cr-split#-e_1_3_2_1_35_1.1","unstructured":"Cheng Tai Tong Xiao Yi Zhang Xiaogang Wang and Weinan E. 2015. Convolutional neural networks with low-rank regularization. https:\/\/doi.org\/10.48550\/ARXIV.1511.06067 10.48550\/ARXIV.1511.06067"},{"key":"#cr-split#-e_1_3_2_1_35_1.2","unstructured":"Cheng Tai Tong Xiao Yi Zhang Xiaogang Wang and Weinan E. 2015. Convolutional neural networks with low-rank regularization. https:\/\/doi.org\/10.48550\/ARXIV.1511.06067"},{"key":"#cr-split#-e_1_3_2_1_36_1.1","unstructured":"Karen Ullrich Edward Meeds and Max Welling. 2017. Soft Weight-Sharing for Neural Network Compression. https:\/\/doi.org\/10.48550\/ARXIV.1702.04008 10.48550\/ARXIV.1702.04008"},{"key":"#cr-split#-e_1_3_2_1_36_1.2","unstructured":"Karen Ullrich Edward Meeds and Max Welling. 2017. Soft Weight-Sharing for Neural Network Compression. https:\/\/doi.org\/10.48550\/ARXIV.1702.04008"},{"key":"#cr-split#-e_1_3_2_1_37_1.1","unstructured":"Ashish Vaswani Noam Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan N. Gomez Lukasz Kaiser and Illia Polosukhin. 2017. Attention Is All You Need. https:\/\/doi.org\/10.48550\/ARXIV.1706.03762 10.48550\/ARXIV.1706.03762"},{"key":"#cr-split#-e_1_3_2_1_37_1.2","unstructured":"Ashish Vaswani Noam Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan N. Gomez Lukasz Kaiser and Illia Polosukhin. 2017. Attention Is All You Need. https:\/\/doi.org\/10.48550\/ARXIV.1706.03762"},{"key":"e_1_3_2_1_38_1","volume-title":"Bowman","author":"Wang Alex","year":"2018","unstructured":"Alex Wang , Amanpreet Singh , Julian Michael , Felix Hill , Omer Levy , and Samuel R . Bowman . 2018 . 
GLUE : A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding . https:\/\/doi.org\/10.48550\/ARXIV.1804.07461 10.48550\/ARXIV.1804.07461 Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel R. Bowman. 2018. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. https:\/\/doi.org\/10.48550\/ARXIV.1804.07461"},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.521"},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.emnlp-main.749"},{"key":"e_1_3_2_1_41_1","volume-title":"Parameter-Efficient Tuning by Manipulating Hidden States of Pretrained Language Models For Classification Tasks. arXiv preprint arXiv:2204.04596","author":"Yang Haoran","year":"2022","unstructured":"Haoran Yang , Piji Li , and Wai Lam . 2022. Parameter-Efficient Tuning by Manipulating Hidden States of Pretrained Language Models For Classification Tasks. arXiv preprint arXiv:2204.04596 ( 2022 ). Haoran Yang, Piji Li, and Wai Lam. 2022. Parameter-Efficient Tuning by Manipulating Hidden States of Pretrained Language Models For Classification Tasks. 
arXiv preprint arXiv:2204.04596 (2022)."}],"event":{"name":"CIKM '23: The 32nd ACM International Conference on Information and Knowledge Management","location":"Birmingham United Kingdom","acronym":"CIKM '23","sponsor":["SIGWEB ACM Special Interest Group on Hypertext, Hypermedia, and Web","SIGIR ACM Special Interest Group on Information Retrieval"]},"container-title":["Proceedings of the 32nd ACM International Conference on Information and Knowledge Management"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3583780.3614904","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3583780.3614904","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T16:36:43Z","timestamp":1750178203000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3583780.3614904"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,10,21]]},"references-count":56,"alternative-id":["10.1145\/3583780.3614904","10.1145\/3583780"],"URL":"https:\/\/doi.org\/10.1145\/3583780.3614904","relation":{},"subject":[],"published":{"date-parts":[[2023,10,21]]},"assertion":[{"value":"2023-10-21","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}