{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,17]],"date-time":"2025-12-17T08:55:33Z","timestamp":1765961733221,"version":"3.41.0"},"reference-count":85,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2024,3,15]],"date-time":"2024-03-15T00:00:00Z","timestamp":1710460800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["61972013 and 61932007"],"award-info":[{"award-number":["61972013 and 61932007"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Guangxi Collaborative Innovation Center of Multi-source Information Integration and Intelligent Processing"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Softw. Eng. Methodol."],"published-print":{"date-parts":[[2024,3,31]]},"abstract":"<jats:p>\n            With the widespread success of deep learning technologies, many trained deep neural network (DNN) models are now publicly available. However, directly reusing the public DNN models for new tasks often fails due to mismatching functionality or performance. Inspired by the notion of modularization and composition in software reuse, we investigate the possibility of improving the reusability of DNN models in a more fine-grained manner. Specifically, we propose two modularization approaches named CNNSplitter and GradSplitter, which can decompose a trained convolutional neural network (CNN) model for\n            <jats:italic>N<\/jats:italic>\n            -class classification into\n            <jats:italic>N<\/jats:italic>\n            small reusable modules. Each module recognizes one of the\n            <jats:italic>N<\/jats:italic>\n            classes and contains a part of the convolution kernels of the trained CNN model. Then, the resulting modules can be reused to patch existing CNN models or build new CNN models through composition. The main difference between CNNSplitter and GradSplitter lies in their search methods: the former relies on a genetic algorithm to explore search space, while the latter utilizes a gradient-based search method. Our experiments with three representative CNNs on three widely used public datasets demonstrate the effectiveness of the proposed approaches. Compared with CNNSplitter, GradSplitter incurs less accuracy loss, produces much smaller modules (19.88% fewer kernels), and achieves better results on patching weak models. In particular, experiments on GradSplitter show that (1) by patching weak models, the average improvement in terms of precision, recall, and F1-score is 17.13%, 4.95%, and 11.47%, respectively, and (2) for a new task, compared with the models trained from scratch, reusing modules achieves similar accuracy (the average loss of accuracy is only 2.46%) without a costly training process. 
Our approaches provide a viable solution to the rapid development and improvement of CNN models.\n          <\/jats:p>","DOI":"10.1145\/3632744","type":"journal-article","created":{"date-parts":[[2023,11,13]],"date-time":"2023-11-13T11:49:08Z","timestamp":1699876148000},"page":"1-39","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["Reusing Convolutional Neural Network Models through Modularization and Composition"],"prefix":"10.1145","volume":"33","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0828-5544","authenticated-orcid":false,"given":"Binhang","family":"Qi","sequence":"first","affiliation":[{"name":"SKLSDE, School of Computer Science and Engineering, Beihang University, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7654-5574","authenticated-orcid":false,"given":"Hailong","family":"Sun","sequence":"additional","affiliation":[{"name":"SKLSDE, School of Software, Beihang University, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3063-9425","authenticated-orcid":false,"given":"Hongyu","family":"Zhang","sequence":"additional","affiliation":[{"name":"Chongqing University, Chongqing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9895-4600","authenticated-orcid":false,"given":"Xiang","family":"Gao","sequence":"additional","affiliation":[{"name":"School of Software, Beihang University, Beijing, China"}]}],"member":"320","published-online":{"date-parts":[[2024,3,15]]},"reference":[{"key":"e_1_3_1_2_2","unstructured":"Github. 2022. CNNSplitter. Retrieved from https:\/\/github.com\/qibinhang\/CNNSplitter"},{"key":"e_1_3_1_3_2","unstructured":"Github. 2023. GradSplitter. Retrieved from https:\/\/github.com\/qibinhang\/GradSplitter"},{"key":"e_1_3_1_4_2","first-page":"2549","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Almahairi Amjad","year":"2016","unstructured":"Amjad Almahairi, Nicolas Ballas, Tim Cooijmans, Yin Zheng, Hugo Larochelle, and Aaron Courville. 2016. Dynamic capacity networks. In Proceedings of the International Conference on Machine Learning. PMLR, 2549\u20132558."},{"key":"e_1_3_1_5_2","article-title":"Estimating or propagating gradients through stochastic neurons for conditional computation","author":"Bengio Yoshua","year":"2013","unstructured":"Yoshua Bengio, Nicholas L\u00e9onard, and Aaron C. Courville. 2013. Estimating or propagating gradients through stochastic neurons for conditional computation. Retrieved from https:\/\/arxiv.org\/abs\/1308.3432","journal-title":"Retrieved from"},{"key":"e_1_3_1_6_2","first-page":"2180","volume-title":"Proceedings of the International Conference on Neural Information Processing Systems","author":"Chen Xi","year":"2016","unstructured":"Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, and Pieter Abbeel. 2016. InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets. In Proceedings of the International Conference on Neural Information Processing Systems. 2180\u20132188."},{"key":"e_1_3_1_7_2","article-title":"PaLM: Scaling language modeling with pathways","author":"Chowdhery Aakanksha","year":"2022","unstructured":"Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann et\u00a0al. 2022. PaLM: Scaling language modeling with pathways. 
Retrieved from https:\/\/arxiv.org\/abs\/2204.02311","journal-title":"Retrieved from"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1108\/eb026538"},{"key":"e_1_3_1_9_2","first-page":"261","volume-title":"Proceedings of the International Conference on Language Resources and Evaluation","author":"Derczynski Leon","year":"2016","unstructured":"Leon Derczynski. 2016. Complementarity, F-score, and NLP evaluation. In Proceedings of the International Conference on Language Resources and Evaluation. 261\u2013266."},{"key":"e_1_3_1_10_2","first-page":"4171","volume-title":"Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT\u201919)","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT\u201919). Association for Computational Linguistics, 4171\u20134186."},{"key":"e_1_3_1_11_2","first-page":"647","volume-title":"Proceedings of the 31st International Conference on Machine Learning","volume":"32","author":"Donahue Jeff","year":"2014","unstructured":"Jeff Donahue, Yangqing Jia, Oriol Vinyals, Judy Hoffman, Ning Zhang, Eric Tzeng, and Trevor Darrell. 2014. DeCAF: A deep convolutional activation feature for generic visual recognition. In Proceedings of the 31st International Conference on Machine Learning, Vol. 32. 647\u2013655."},{"key":"e_1_3_1_12_2","first-page":"5547","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Du Nan","year":"2022","unstructured":"Nan Du, Yanping Huang, Andrew M. Dai, Simon Tong, Dmitry Lepikhin, Yuanzhong Xu, Maxim Krikun, Yanqi Zhou, Adams Wei Yu, Orhan Firat et\u00a0al. 2022. GLaM: Efficient scaling of language models with mixture-of-experts. In Proceedings of the International Conference on Machine Learning. PMLR, 5547\u20135569."},{"issue":"55","key":"e_1_3_1_13_2","first-page":"1","article-title":"Neural architecture search: A survey","volume":"20","author":"Elsken Thomas","year":"2019","unstructured":"Thomas Elsken, Jan Hendrik Metzen, Frank Hutter et\u00a0al. 2019. Neural architecture search: A survey. J. Mach. Learn. Res. 20, 55 (2019), 1\u201321.","journal-title":"J. Mach. Learn. Res."},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1145\/3395363.3397357"},{"key":"e_1_3_1_15_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Frankle Jonathan","year":"2019","unstructured":"Jonathan Frankle and Michael Carbin. 2019. The lottery ticket hypothesis: Finding sparse, trainable neural networks. In Proceedings of the International Conference on Learning Representations. OpenReview.net."},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2020.10.113"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1145\/3377811.3380415"},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.81"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10710-017-9314-z"},{"key":"e_1_3_1_20_2","article-title":"Learning both weights and connections for efficient neural network","volume":"28","author":"Han Song","year":"2015","unstructured":"Song Han, Jeff Pool, John Tran, and William Dally. 2015. Learning both weights and connections for efficient neural network. Adv. 
Neural Info. Process. Syst. 28 (2015).","journal-title":"Adv. Neural Info. Process. Syst."},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1016\/S0950-5849(01)00189-6"},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_1_23_2","unstructured":"Christopher R. Houck, Jeff Joines, and Michael G. Kay. 1995. A genetic algorithm for function optimization: A Matlab implementation. North Carolina State University, Technical Report, 1995."},{"key":"e_1_3_1_24_2","article-title":"Binarized neural networks","volume":"29","author":"Hubara Itay","year":"2016","unstructured":"Itay Hubara, Matthieu Courbariaux, Daniel Soudry, Ran El-Yaniv, and Yoshua Bengio. 2016. Binarized neural networks. Adv. Neural Info. Process. Syst. 29 (2016).","journal-title":"Adv. Neural Info. Process. Syst."},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE48619.2023.00093"},{"key":"e_1_3_1_26_2","first-page":"448","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Ioffe Sergey","year":"2015","unstructured":"Sergey Ioffe and Christian Szegedy. 2015. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the International Conference on Machine Learning. 448\u2013456."},{"key":"e_1_3_1_27_2","unstructured":"Alex Krizhevsky. 2009. Learning Multiple Layers of Features from Tiny Images. University of Toronto, Toronto, Canada, Technical Report 2009. [Online]. Available: https:\/\/www.cs.toronto.edu\/~kriz\/learning-features-2009-TR.pdf"},{"key":"e_1_3_1_28_2","first-page":"1097","article-title":"Imagenet classification with deep convolutional neural networks","volume":"25","author":"Krizhevsky Alex","year":"2012","unstructured":"Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2012. Imagenet classification with deep convolutional neural networks. Adv. Neural Info. Process. Syst. 25 (2012), 1097\u20131105.","journal-title":"Adv. Neural Info. Process. Syst."},{"key":"e_1_3_1_29_2","first-page":"950","volume-title":"Advances in Neural Information Processing Systems","author":"Krogh Anders","year":"1992","unstructured":"Anders Krogh and John A. Hertz. 1992. A simple weight decay can improve generalization. In Advances in Neural Information Processing Systems. 950\u2013957."},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1109\/5.726791"},{"key":"e_1_3_1_31_2","first-page":"598","volume-title":"Advances in Neural Information Processing Systems","author":"LeCun Yann","year":"1990","unstructured":"Yann LeCun, John S. Denker, and Sara A. Solla. 1990. Optimal brain damage. In Advances in Neural Information Processing Systems. 598\u2013605."},{"key":"e_1_3_1_32_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Lepikhin Dmitry","year":"2021","unstructured":"Dmitry Lepikhin, HyoukJoong Lee, Yuanzhong Xu, Dehao Chen, Orhan Firat, Yanping Huang, Maxim Krikun, Noam Shazeer, and Zhifeng Chen. 2021. GShard: Scaling giant models with conditional computation and automatic sharding. In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_1_33_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Li Hao","year":"2017","unstructured":"Hao Li, Asim Kadav, Igor Durdanovic, Hanan Samet, and Hans Peter Graf. 2017. Pruning filters for efficient ConvNets. 
In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2016.2615100"},{"key":"e_1_3_1_35_2","first-page":"2351","article-title":"Ensemble distillation for robust model fusion in federated learning","volume":"33","author":"Lin Tao","year":"2020","unstructured":"Tao Lin, Lingjing Kong, Sebastian U. Stich, and Martin Jaggi. 2020. Ensemble distillation for robust model fusion in federated learning. Adv. Neural Info. Process. Syst. 33 (2020), 2351\u20132363.","journal-title":"Adv. Neural Info. Process. Syst."},{"key":"e_1_3_1_36_2","article-title":"Hierarchical representations for efficient architecture search","author":"Liu Hanxiao","year":"2017","unstructured":"Hanxiao Liu, Karen Simonyan, Oriol Vinyals, Chrisantha Fernando, and Koray Kavukcuoglu. 2017. Hierarchical representations for efficient architecture search. Retrieved from https:\/\/arxiv.org\/abs\/1711.00436","journal-title":"Retrieved from"},{"key":"e_1_3_1_37_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.infsof.2010.04.002"},{"key":"e_1_3_1_39_2","doi-asserted-by":"publisher","DOI":"10.1145\/3236024.3236082"},{"key":"e_1_3_1_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE43902.2021.00045"},{"key":"e_1_3_1_41_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Molchanov Pavlo","year":"2017","unstructured":"Pavlo Molchanov, Stephen Tyree, Tero Karras, Timo Aila, and Jan Kautz. 2017. Pruning convolutional neural networks for resource efficient inference. In Proceedings of the International Conference on Learning Representations. OpenReview.net."},{"key":"e_1_3_1_42_2","unstructured":"Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y. Ng. 2011. Reading digits in natural images with unsupervised feature learning. NeurIPS Workshop 2011."},{"key":"e_1_3_1_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/ASE.2011.6100062"},{"key":"e_1_3_1_44_2","doi-asserted-by":"publisher","DOI":"10.1145\/3368089.3409668"},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1145\/3510003.3510051"},{"key":"e_1_3_1_46_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2009.191"},{"key":"e_1_3_1_47_2","doi-asserted-by":"publisher","DOI":"10.1145\/361598.361623"},{"key":"e_1_3_1_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.1976.233797"},{"key":"e_1_3_1_49_2","first-page":"7","volume-title":"Proceedings of the ITT Workshop on Reusability in Programming","author":"Parnas David Lorge","year":"1983","unstructured":"David Lorge Parnas, Paul C. Clements, and David M. Weiss. 1983. Enhancing reusability with information hiding. In Proceedings of the ITT Workshop on Reusability in Programming. 7\u20139."},{"key":"e_1_3_1_50_2","doi-asserted-by":"publisher","DOI":"10.1145\/3132747.3132785"},{"key":"e_1_3_1_51_2","doi-asserted-by":"publisher","DOI":"10.1093\/genetics\/155.2.945"},{"key":"e_1_3_1_52_2","article-title":"Patching weak convolutional neural network models through modularization and composition","author":"Qi Binhang","year":"2022","unstructured":"Binhang Qi, Hailong Sun, Xiang Gao, and Hongyu Zhang. 2022. Patching weak convolutional neural network models through modularization and composition. 
In Proceedings of the 37th International Conference on Automated Software Engineering.","journal-title":"Proceedings of the 37th International Conference on Automated Software Engineering"},{"key":"e_1_3_1_53_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE48619.2023.00090"},{"key":"e_1_3_1_54_2","article-title":"Modularizing while training: A new paradigm for modularizing DNN models","author":"Qi Binhang","year":"2023","unstructured":"Binhang Qi, Hailong Sun, Hongyu Zhang, Ruobing Zhao, and Xiang Gao. 2023. Modularizing while training: A new paradigm for modularizing DNN models. In Proceedings of the 46th International Conference on Software Engineering.","journal-title":"Proceedings of the 46th International Conference on Software Engineering"},{"key":"e_1_3_1_55_2","unstructured":"rangeetpan. 2020. Decompose a DNN Model into Modules. Retrieved from https:\/\/github.com\/rangeetpan\/decomposeDNNintoModules"},{"key":"e_1_3_1_56_2","unstructured":"rangeetpan. 2022. Decompose a CNN Model into Modules. Retrieved from https:\/\/github.com\/rangeetpan\/Decomposition."},{"key":"e_1_3_1_57_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33014780"},{"key":"e_1_3_1_58_2","first-page":"2902","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Real Esteban","year":"2017","unstructured":"Esteban Real, Sherry Moore, Andrew Selle, Saurabh Saxena, Yutaka Leon Suematsu, Jie Tan, Quoc V. Le, and Alexey Kurakin. 2017. Large-scale evolution of image classifiers. In Proceedings of the International Conference on Machine Learning. PMLR, 2902\u20132911."},{"key":"e_1_3_1_59_2","doi-asserted-by":"publisher","DOI":"10.1016\/0305-0548(93)E0014-K"},{"key":"e_1_3_1_60_2","first-page":"9075","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Rosenfeld Jonathan S.","year":"2021","unstructured":"Jonathan S. Rosenfeld, Jonathan Frankle, Michael Carbin, and Nir Shavit. 2021. On the predictability of pruning across scales. In Proceedings of the International Conference on Machine Learning. 9075\u20139083."},{"key":"e_1_3_1_61_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00474"},{"key":"e_1_3_1_62_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Shazeer Noam","year":"2017","unstructured":"Noam Shazeer, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc Le, Geoffrey Hinton, and Jeff Dean. 2017. Outrageously large neural networks: The sparsely gated mixture-of-experts layer. In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_1_63_2","doi-asserted-by":"publisher","DOI":"10.1145\/3551349.3556964"},{"key":"e_1_3_1_64_2","doi-asserted-by":"publisher","DOI":"10.1186\/s40537-019-0197-0"},{"key":"e_1_3_1_65_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Simonyan Karen","year":"2015","unstructured":"Karen Simonyan and Andrew Zisserman. 2015. Very deep convolutional networks for large-scale image recognition. In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_1_66_2","unstructured":"sksq96. 2021. pytorch-summary. 
Retrieved from https:\/\/github.com\/sksq96\/pytorch-summary"},{"key":"e_1_3_1_67_2","doi-asserted-by":"publisher","DOI":"10.5555\/2627435.2670313"},{"key":"e_1_3_1_68_2","doi-asserted-by":"publisher","DOI":"10.1109\/ASE51524.2021.9678586"},{"key":"e_1_3_1_69_2","first-page":"4771","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Suganuma Masanori","year":"2018","unstructured":"Masanori Suganuma, Mete Ozay, and Takayuki Okatani. 2018. Exploiting the potential of standard convolutional autoencoders for image restoration by evolutionary search. In Proceedings of the International Conference on Machine Learning. 4771\u20134780."},{"key":"e_1_3_1_70_2","first-page":"1139","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Sutskever Ilya","year":"2013","unstructured":"Ilya Sutskever, James Martens, George Dahl, and Geoffrey Hinton. 2013. On the importance of initialization and momentum in deep learning. In Proceedings of the International Conference on Machine Learning. 1139\u20131147."},{"key":"e_1_3_1_71_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"e_1_3_1_72_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.308"},{"key":"e_1_3_1_73_2","doi-asserted-by":"publisher","DOI":"10.1145\/3180155.3180220"},{"key":"e_1_3_1_74_2","unstructured":"tokusumi. 2020. keras-flops. Retrieved from https:\/\/github.com\/tokusumi\/keras-flops"},{"key":"e_1_3_1_75_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE43902.2021.00046"},{"key":"e_1_3_1_76_2","article-title":"Model reuse with reduced kernel mean embedding specification","author":"Wu Xi-Zhu","year":"2021","unstructured":"Xi-Zhu Wu, Wenkai Xu, Song Liu, and Zhi-Hua Zhou. 2021. Model reuse with reduced kernel mean embedding specification. IEEE Trans. Knowl. Data Eng. 35, 1 (2021), 699\u2013710.","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"e_1_3_1_77_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.154"},{"key":"e_1_3_1_78_2","doi-asserted-by":"publisher","DOI":"10.1145\/3293882.3330579"},{"key":"e_1_3_1_79_2","doi-asserted-by":"publisher","DOI":"10.1007\/s13244-018-0639-9"},{"key":"e_1_3_1_80_2","article-title":"mPLUG-Owl: Modularization empowers large language models with multimodality","author":"Ye Qinghao","year":"2023","unstructured":"Qinghao Ye, Haiyang Xu, Guohai Xu, Jiabo Ye, Ming Yan, Yiyang Zhou, Junyang Wang, Anwen Hu, Pengcheng Shi, Yaya Shi et\u00a0al. 2023. mPLUG-Owl: Modularization empowers large language models with multimodality. Retrieved from https:\/\/arxiv.org\/abs\/2304.14178","journal-title":"Retrieved from"},{"key":"e_1_3_1_81_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Yin Penghang","year":"2019","unstructured":"Penghang Yin, Jiancheng Lyu, Shuai Zhang, Stanley J. Osher, Yingyong Qi, and Jack Xin. 2019. Understanding straight-through estimator in training activation quantized neural nets. In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_1_82_2","first-page":"3320","volume-title":"Proceedings of the 27th International Conference on Neural Information Processing Systems","author":"Yosinski Jason","year":"2014","unstructured":"Jason Yosinski, Jeff Clune, Yoshua Bengio, and Hod Lipson. 2014. How transferable are features in deep neural networks? In Proceedings of the 27th International Conference on Neural Information Processing Systems. 
3320\u20133328."},{"key":"e_1_3_1_83_2","doi-asserted-by":"publisher","DOI":"10.5244\/C.30.87"},{"key":"e_1_3_1_84_2","doi-asserted-by":"publisher","DOI":"10.1145\/3238147.3238187"},{"key":"e_1_3_1_85_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00257"},{"key":"e_1_3_1_86_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Zhu Michael","year":"2018","unstructured":"Michael Zhu and Suyog Gupta. 2018. To prune, or not to prune: Exploring the efficacy of pruning for model compression. In Proceedings of the International Conference on Learning Representations. OpenReview.net."}],"container-title":["ACM Transactions on Software Engineering and Methodology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3632744","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3632744","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T22:51:04Z","timestamp":1750287064000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3632744"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,3,15]]},"references-count":85,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2024,3,31]]}},"alternative-id":["10.1145\/3632744"],"URL":"https:\/\/doi.org\/10.1145\/3632744","relation":{},"ISSN":["1049-331X","1557-7392"],"issn-type":[{"type":"print","value":"1049-331X"},{"type":"electronic","value":"1557-7392"}],"subject":[],"published":{"date-parts":[[2024,3,15]]},"assertion":[{"value":"2023-04-27","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-10-31","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-03-15","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}
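
For readers of this record: the abstract above describes decomposing a trained CNN for N-class classification into N per-class modules and then composing those modules into a new N-class classifier. Below is a minimal PyTorch sketch of that composition idea only. The class name ComposedClassifier, the dummy_module helper, and the assumption that each module emits a single per-class confidence score are illustrative inventions, not the authors' interfaces; the actual implementations are in the CNNSplitter and GradSplitter repositories cited in the reference list.

```python
import torch
import torch.nn as nn

class ComposedClassifier(nn.Module):
    """Compose N single-class modules into one N-class classifier.

    Assumption (per the abstract): each module scores whether an input
    belongs to its own class, and the composed model predicts the class
    whose module is most confident.
    """
    def __init__(self, class_modules):
        super().__init__()
        self.class_modules = nn.ModuleList(class_modules)

    def forward(self, x):
        # One (batch, 1) confidence per module, concatenated to (batch, N).
        scores = torch.cat([m(x) for m in self.class_modules], dim=1)
        return scores.argmax(dim=1)  # predicted class index per input

# Hypothetical stand-ins for the modules a real decomposition would produce.
def dummy_module():
    return nn.Sequential(
        nn.Conv2d(3, 8, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(8, 1),
    )

model = ComposedClassifier([dummy_module() for _ in range(10)])
print(model(torch.randn(4, 3, 32, 32)))  # four class predictions in [0, 10)
```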