{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,20]],"date-time":"2025-06-20T04:08:55Z","timestamp":1750392535335,"version":"3.41.0"},"reference-count":65,"publisher":"Association for Computing Machinery (ACM)","issue":"FSE","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Softw. Eng."],"published-print":{"date-parts":[[2025,6,19]]},"abstract":"<jats:p>Multiple machine learning (ML) models are often incorporated into real-world ML systems. However, updating an individual model in these ML systems frequently results in regression errors, where the new model performs worse than the old model for some inputs. While model-level regression errors have been widely studied, little is known about how regression errors propagate at system level. To address this gap, we propose RegTrieve, a novel retrieval-enhanced ensemble approach to reduce regression errors at both model and system level. 
Our evaluation across various model update scenarios shows that RegTrieve reduces system-level regression errors with almost no impact on system accuracy, outperforming all baselines by 20.43% on average.<\/jats:p>","DOI":"10.1145\/3729358","type":"journal-article","created":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T15:15:34Z","timestamp":1750346134000},"page":"1960-1982","source":"Crossref","is-referenced-by-count":0,"title":["RegTrieve: Reducing System-Level Regression Errors for Machine Learning Systems via Retrieval-Enhanced Ensemble"],"prefix":"10.1145","volume":"2","author":[{"ORCID":"https:\/\/orcid.org\/0009-0007-4118-7376","authenticated-orcid":false,"given":"Junming","family":"Cao","sequence":"first","affiliation":[{"name":"Fudan University, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-5684-770X","authenticated-orcid":false,"given":"Xuwen","family":"Xiang","sequence":"additional","affiliation":[{"name":"Fudan University, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8982-1483","authenticated-orcid":false,"given":"Mingfei","family":"Cheng","sequence":"additional","affiliation":[{"name":"Singapore Management University, Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7238-7492","authenticated-orcid":false,"given":"Bihuan","family":"Chen","sequence":"additional","affiliation":[{"name":"Fudan University, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-4004-9032","authenticated-orcid":false,"given":"Xinyan","family":"Wang","sequence":"additional","affiliation":[{"name":"Fudan University, Shanghai, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-1634-9721","authenticated-orcid":false,"given":"You","family":"Lu","sequence":"additional","affiliation":[{"name":"Fudan University, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-4195-0122","authenticated-orcid":false,"given":"Chaofeng","family":"Sha","sequence":"additional","affiliation":[{"name":"Fudan University, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1288-6502","authenticated-orcid":false,"given":"Xiaofei","family":"Xie","sequence":"additional","affiliation":[{"name":"Singapore Management University, Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3376-2581","authenticated-orcid":false,"given":"Xin","family":"Peng","sequence":"additional","affiliation":[{"name":"Fudan University, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2025,6,19]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Proceedings of the 33rd ACM\/IEEE International Conference on Automated Software Engineering. 143\u2013154","author":"Abdessalem Raja Ben","year":"2018","unstructured":"Raja Ben Abdessalem, Annibale Panichella, Shiva Nejati, Lionel C. Briand, and Thomas Stifter. 2018. Testing Autonomous Cars for Feature Interaction Failures Using Many-Objective Search. In Proceedings of the 33rd ACM\/IEEE International Conference on Automated Software Engineering. 143\u2013154."},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis. 88\u2013100","author":"Abdessalem Raja Ben","year":"2020","unstructured":"Raja Ben Abdessalem, Annibale Panichella, Shiva Nejati, Lionel C. Briand, and Thomas Stifter. 2020. 
Automated Repair of Feature Interaction Failures in Automated Driving Systems. In Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis. 88\u2013100."},{"key":"e_1_2_1_3_1","volume-title":"Proceedings of the 11th International Conference on Learning Representations.","author":"Allen-Zhu Zeyuan","year":"2023","unstructured":"Zeyuan Allen-Zhu and Yuanzhi Li. 2023. Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning. In Proceedings of the 11th International Conference on Learning Representations."},{"key":"e_1_2_1_4_1","doi-asserted-by":"crossref","first-page":"121","DOI":"10.1016\/j.jet.2019.01.004","article-title":"Strategy-proof Pareto-improvement","volume":"181","author":"Alva Samson","year":"2019","unstructured":"Samson Alva and Vikram Manjunath. 2019. Strategy-proof Pareto-improvement. Journal of Economic Theory, 181 (2019), 121\u2013142.","journal-title":"Journal of Economic Theory"},{"key":"e_1_2_1_5_1","volume-title":"Proceedings of the 41st International Conference on Software Engineering: Software Engineering in Practice. 291\u2013300","author":"Amershi Saleema","year":"2019","unstructured":"Saleema Amershi, Andrew Begel, Christian Bird, Robert DeLine, Harald Gall, Ece Kamar, Nachiappan Nagappan, Besmira Nushi, and Thomas Zimmermann. 2019. Software Engineering for Machine Learning: A Case Study. In Proceedings of the 41st International Conference on Software Engineering: Software Engineering in Practice. 291\u2013300."},{"key":"e_1_2_1_6_1","doi-asserted-by":"crossref","first-page":"120","DOI":"10.1109\/MS.2021.3134386","article-title":"Feature Interactions on Steroids: On the Composition of ML Models","volume":"39","author":"Apel Sven","year":"2022","unstructured":"Sven Apel, Christian K\u00e4stner, and Eunsuk Kang. 2022. Feature Interactions on Steroids: On the Composition of ML Models. 
IEEE Software, 39, 3 (2022), 120\u2013124.","journal-title":"IEEE Software"},{"key":"e_1_2_1_7_1","unstructured":"Anonymous Authors. 2024. RegTrieve: Reducing System-Level Regression Errors for Machine Learning Systems. https:\/\/sites.google.com\/view\/regtrieve\/home"},{"key":"e_1_2_1_8_1","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence. 2429\u20132437","author":"Bansal Gagan","year":"2019","unstructured":"Gagan Bansal, Besmira Nushi, Ece Kamar, Daniel S Weld, Walter S Lasecki, and Eric Horvitz. 2019. Updates in human-ai teams: Understanding and addressing the performance\/compatibility tradeoff. In Proceedings of the AAAI Conference on Artificial Intelligence. 2429\u20132437."},{"key":"e_1_2_1_9_1","volume-title":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 1743\u20131751","author":"Bernardi Lucas","year":"2019","unstructured":"Lucas Bernardi, Themistoklis Mavridis, and Pablo Estevez. 2019. 150 Successful Machine Learning Models: 6 Lessons Learned at Booking.Com. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 1743\u20131751."},{"key":"e_1_2_1_10_1","volume-title":"Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track). 538\u2013551","author":"Caciolai Andrea","year":"2023","unstructured":"Andrea Caciolai, Verena Weber, Tobias Falke, Alessandro Pedrani, and Davide Bernardi. 2023. Regression-free model updates for spoken language understanding. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track). 538\u2013551."},{"key":"e_1_2_1_11_1","volume-title":"Cascade R-CNN: High quality object detection and instance segmentation","author":"Cai Zhaowei","year":"2019","unstructured":"Zhaowei Cai and Nuno Vasconcelos. 2019. Cascade R-CNN: High quality object detection and instance segmentation. 
IEEE transactions on pattern analysis and machine intelligence, 43, 5 (2019), 1483\u20131498."},{"key":"e_1_2_1_12_1","volume-title":"Proceedings of the IEEE International Conference on Software Maintenance and Evolution. 258\u2013270","author":"Cao Junming","year":"2023","unstructured":"Junming Cao, Bihuan Chen, Longjie Hu, Jie Gao, Kaifeng Huang, Xuezhi Song, and Xin Peng. 2023. Characterizing the Complexity and Its Impact on Testing in ML-Enabled Systems: A Case Study on Rasa. In Proceedings of the IEEE International Conference on Software Maintenance and Evolution. 258\u2013270."},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the ACM\/IEEE 42nd International Conference on Software Engineering. 112\u2013124","author":"Chen Lingchao","year":"2020","unstructured":"Lingchao Chen, Foyzul Hassan, Xiaoyin Wang, and Lingming Zhang. 2020. Taming behavioral backward incompatibilities via cross-project testing and analysis. In Proceedings of the ACM\/IEEE 42nd International Conference on Software Engineering. 112\u2013124."},{"key":"e_1_2_1_14_1","volume-title":"Proceedings of the International Conference on Machine Learning. 1341\u20131350","author":"Collobert Ronan","year":"2019","unstructured":"Ronan Collobert, Awni Hannun, and Gabriel Synnaeve. 2019. A fully differentiable beam search decoder. In Proceedings of the International Conference on Machine Learning. 1341\u20131350."},{"key":"e_1_2_1_15_1","unstructured":"Matthijs Douze Alexandr Guzhva Chengqi Deng Jeff Johnson Gergely Szilvasy Pierre-Emmanuel Mazar\u00e9 Maria Lomeli Lucas Hosseini and Herv\u00e9 J\u00e9gou. 2024. The Faiss library. arXiv preprint arXiv:2401.08281."},{"key":"e_1_2_1_16_1","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition. 3354\u20133361","author":"Geiger Andreas","year":"2012","unstructured":"Andreas Geiger, Philip Lenz, and Raquel Urtasun. 2012. Are we ready for autonomous driving? the kitti vision benchmark suite. 
In Proceedings of the IEEE conference on computer vision and pattern recognition. 3354\u20133361."},{"key":"e_1_2_1_17_1","doi-asserted-by":"crossref","unstructured":"Qi Guo Xiaohong Li Xiaofei Xie Shangqing Liu Ze Tang Ruitao Feng Junjie Wang Jidong Ge and Lei Bu. 2024. FT2Ra: A Fine-Tuning-Inspired Approach to Retrieval-Augmented Code Completion. arXiv preprint arXiv:2404.01554.","DOI":"10.1145\/3650212.3652130"},{"key":"e_1_2_1_18_1","unstructured":"Christian K\u00e4stner. 2022. Machine Learning in Production: From Models to Systems. https:\/\/ckaestne.medium.com\/machine-learning-in-production-from-models-to-systems-e1422ec7cd65"},{"key":"e_1_2_1_19_1","first-page":"2","article-title":"Bert: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics","volume":"1","author":"Ming-Wei Chang Jacob Devlin","year":"2019","unstructured":"Jacob Devlin Ming-Wei Chang Kenton and Lee Kristina Toutanova. 2019. Bert: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 1, 2.","journal-title":"Human Language Technologies."},{"key":"e_1_2_1_20_1","volume-title":"Proceedings of the 8th International Conference on Learning Representations.","author":"Khandelwal Urvashi","year":"2020","unstructured":"Urvashi Khandelwal, Omer Levy, Dan Jurafsky, Luke Zettlemoyer, and Mike Lewis. 2020. Generalization through Memorization: Nearest Neighbor Language Models. In Proceedings of the 8th International Conference on Learning Representations."},{"key":"e_1_2_1_21_1","volume-title":"SUMBT: Slot-utterance matching for universal and scalable belief tracking. arXiv preprint arXiv:1907.07421.","author":"Lee Hwaran","year":"2019","unstructured":"Hwaran Lee, Jinsik Lee, and Tae-Yoon Kim. 2019. 
SUMBT: Slot-utterance matching for universal and scalable belief tracking. arXiv preprint arXiv:1907.07421."},{"key":"e_1_2_1_22_1","volume-title":"Proceedings of the 19th Annual Conference of the International Speech Communication Association. 3459\u20133463","author":"Li Chia-Hsuan","year":"2018","unstructured":"Chia-Hsuan Li, Szu-Lin Wu, Chi-Liang Liu, and Hung-yi Lee. 2018. Spoken SQuAD: A Study of Mitigating the Impact of Speech Recognition Errors on Listening Comprehension. In Proceedings of the 19th Annual Conference of the International Speech Communication Association. 3459\u20133463."},{"key":"e_1_2_1_23_1","volume-title":"Proceedings of the 45th IEEE\/ACM International Conference on Software Engineering. 1187\u20131199","author":"Li Zenan","year":"2023","unstructured":"Zenan Li, Maorun Zhang, Jingwei Xu, Yuan Yao, Chun Cao, Taolue Chen, Xiaoxing Ma, and Jian L\u00fc. 2023. Lightweight Approaches to DNN Regression Error Reduction: An Uncertainty Alignment Perspective. In Proceedings of the 45th IEEE\/ACM International Conference on Software Engineering. 1187\u20131199."},{"key":"e_1_2_1_24_1","unstructured":"Zichuan Lin Jing Huang Bowen Zhou Xiaodong He and Tengyu Ma. 2021. Joint System-Wise Optimization for Pipeline Goal-Oriented Dialog System. arXiv preprint arXiv:2106.04835."},{"key":"e_1_2_1_25_1","volume-title":"Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692.","author":"Liu Yinhan","year":"2019","unstructured":"Yinhan Liu. 2019. Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692."},{"key":"e_1_2_1_26_1","volume-title":"Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 6227\u20136240","author":"Lu Shuai","year":"2022","unstructured":"Shuai Lu, Nan Duan, Hojae Han, Daya Guo, Seung-won Hwang, and Alexey Svyatkovskiy. 2022. ReACC: A Retrieval-Augmented Code Completion Framework. 
In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 6227\u20136240."},{"key":"e_1_2_1_27_1","doi-asserted-by":"crossref","unstructured":"Ali Modarressi Mohsen Fayyaz Yadollah Yaghoobzadeh and Mohammad Taher Pilehvar. 2022. GlobEnc: Quantifying global token attribution by incorporating the whole encoder layer in transformers. arXiv preprint arXiv:2205.03286.","DOI":"10.18653\/v1\/2022.naacl-main.19"},{"key":"e_1_2_1_28_1","volume-title":"Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence. 1017\u20131025","author":"Nushi Besmira","year":"2017","unstructured":"Besmira Nushi, Ece Kamar, Eric Horvitz, and Donald Kossmann. 2017. On human intellect and machine failures: Troubleshooting integrative machine learning systems. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence. 1017\u20131025."},{"key":"e_1_2_1_29_1","volume-title":"Proceedings of the IEEE international conference on acoustics, speech and signal processing. 5206\u20135210","author":"Panayotov Vassil","year":"2015","unstructured":"Vassil Panayotov, Guoguo Chen, Daniel Povey, and Sanjeev Khudanpur. 2015. Librispeech: an asr corpus based on public domain audio books. In Proceedings of the IEEE international conference on acoustics, speech and signal processing. 5206\u20135210."},{"key":"e_1_2_1_30_1","volume-title":"Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems. 10386\u201310393","author":"Pang Su","year":"2020","unstructured":"Su Pang, Daniel Morris, and Hayder Radha. 2020. CLOCs: Camera-LiDAR object candidates fusion for 3D object detection. In Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems. 10386\u201310393."},{"key":"e_1_2_1_31_1","volume-title":"Saikat Chakraborty, Baishakhi Ray, and Kai-Wei Chang.","author":"Rizwan Parvez Md.","year":"2021","unstructured":"Md. 
Rizwan Parvez, Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, and Kai-Wei Chang. 2021. Retrieval Augmented Code Generation and Summarization. In Proceedings of the Findings of the Association for Computational Linguistics. 2719\u20132734."},{"key":"e_1_2_1_32_1","volume-title":"Proceedings of the 26th Symposium on Operating Systems Principles. 1\u201318","author":"Pei Kexin","year":"2017","unstructured":"Kexin Pei, Yinzhi Cao, Junfeng Yang, and Suman Jana. 2017. DeepXplore: Automated Whitebox Testing of Deep Learning Systems. In Proceedings of the 26th Symposium on Operating Systems Principles. 1\u201318."},{"key":"e_1_2_1_33_1","volume-title":"Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 1240\u20131250","author":"Peng Zi","year":"2020","unstructured":"Zi Peng, Jinqiu Yang, Tse-Hsun (Peter) Chen, and Lei Ma. 2020. A First Look at the Integration of Machine Learning Models in Complex Autonomous Driving Systems: A Case Study on Apollo. In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 1240\u20131250."},{"key":"e_1_2_1_34_1","volume-title":"Proceedings of the International conference on machine learning. 28492\u201328518","author":"Radford Alec","year":"2023","unstructured":"Alec Radford, Jong Wook Kim, Tao Xu, Greg Brockman, Christine McLeavey, and Ilya Sutskever. 2023. Robust speech recognition via large-scale weak supervision. In Proceedings of the International conference on machine learning. 28492\u201328518."},{"key":"e_1_2_1_35_1","doi-asserted-by":"crossref","unstructured":"Pranav Rajpurkar Robin Jia and Percy Liang. 2018. Know what you don\u2019t know: Unanswerable questions for SQuAD. 
arXiv preprint arXiv:1806.03822.","DOI":"10.18653\/v1\/P18-2124"},{"key":"e_1_2_1_36_1","doi-asserted-by":"crossref","unstructured":"Pranav Rajpurkar Jian Zhang Konstantin Lopyrev and Percy Liang. 2016. Squad: 100 000+ questions for machine comprehension of text. arXiv preprint arXiv:1606.05250.","DOI":"10.18653\/v1\/D16-1264"},{"key":"e_1_2_1_37_1","volume-title":"Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 1711\u20131715","author":"Song Xuezhi","year":"2022","unstructured":"Xuezhi Song, Yun Lin, Yijian Wu, Yifan Zhang, Siang Hwee Ng, Xin Peng, Jin Song Dong, and Hong Mei. 2022. RegMiner: Mining Replicable Regression Dataset from Code Repositories. In Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 1711\u20131715."},{"key":"e_1_2_1_38_1","volume-title":"Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 3272\u20133280","author":"Srivastava Megha","year":"2020","unstructured":"Megha Srivastava, Besmira Nushi, Ece Kamar, Shital Shah, and Eric Horvitz. 2020. An Empirical Analysis of Backward Compatibility in Machine Learning Systems. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 3272\u20133280."},{"key":"e_1_2_1_39_1","volume-title":"Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing. 8004\u20138008","author":"Su Dan","year":"2020","unstructured":"Dan Su and Pascale Fung. 2020. Improving Spoken Question Answering Using Contextualized Word Representation. In Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing. 8004\u20138008."},{"key":"e_1_2_1_40_1","volume-title":"Proceedings of the 21th Annual Meeting of the Special Interest Group on Discourse and Dialogue. 
297\u2013310","author":"Takanobu Ryuichi","year":"2020","unstructured":"Ryuichi Takanobu, Qi Zhu, Jinchao Li, Baolin Peng, Jianfeng Gao, and Minlie Huang. 2020. Is Your Goal-Oriented Dialog Model Performing Really Well? Empirical Analysis of System-wise Evaluation. In Proceedings of the 21th Annual Meeting of the Special Interest Group on Discourse and Dialogue. 297\u2013310."},{"key":"e_1_2_1_41_1","volume-title":"Proceedings of the 38th IEEE\/ACM International Conference on Automated Software Engineering. 421\u2013433","author":"Tang Ze","year":"2023","unstructured":"Ze Tang, Jidong Ge, Shangqing Liu, Tingwei Zhu, Tongtong Xu, Liguo Huang, and Bin Luo. 2023. Domain Adaptive Code Completion via Language Models and Decoupled Domain Databases. In Proceedings of the 38th IEEE\/ACM International Conference on Automated Software Engineering. 421\u2013433."},{"key":"e_1_2_1_42_1","unstructured":"Hugging Face Team. 2022. Hugging Face Website. https:\/\/huggingface.co\/"},{"key":"e_1_2_1_43_1","unstructured":"Hugging Face Team. 2024. Multinomial Sampling. https:\/\/huggingface.co\/docs\/transformers\/generation_strategies##multinomial-sampling"},{"key":"e_1_2_1_44_1","unstructured":"Hugging Face Team. 2024. Transformers Decoding Strategies. https:\/\/huggingface.co\/docs\/transformers\/main\/en\/generation_strategies##decoding-strategies"},{"key":"e_1_2_1_45_1","unstructured":"LangChain Team. 2024. Caching Embeddings. https:\/\/python.langchain.com\/v0.1\/docs\/modules\/data_connection\/text_embedding\/caching_embeddings\/"},{"key":"e_1_2_1_46_1","unstructured":"LangChain Team. 2024. Contextual Compression. https:\/\/python.langchain.com\/v0.1\/docs\/modules\/data_connection\/retrievers\/contextual_compression\/"},{"key":"e_1_2_1_47_1","unstructured":"YOLO Team. 2022. YOLO-v5. https:\/\/zenodo.org\/records\/7347926"},{"key":"e_1_2_1_48_1","volume-title":"Proceedings of the IEEE International Conference on Software Analysis, Evolution and Reengineering. 
1111\u20131121","author":"Tokui Shogo","year":"2022","unstructured":"Shogo Tokui, Susumu Tokumoto, Akihito Yoshii, Fuyuki Ishikawa, Takao Nakagawa, Kazuki Munakata, and Shinji Kikuchi. 2022. Neurecover: Regression-controlled repair of deep neural networks with training history. In Proceedings of the IEEE International Conference on Software Analysis, Evolution and Reengineering. 1111\u20131121."},{"key":"e_1_2_1_49_1","unstructured":"JC Torres. 2022. Galaxy S10 5G update reportedly breaks face recognition. https:\/\/www.slashgear.com\/galaxy-s10-5g-update-reportedly-breaks-face-recognition-26684006\/"},{"key":"e_1_2_1_50_1","volume-title":"Proceedings of the Advances in Neural Information Processing Systems. 116\u2013128","author":"Tr\u00e4uble Frederik","year":"2021","unstructured":"Frederik Tr\u00e4uble, Julius von K\u00fcgelgen, Matth\u00e4us Kleindessner, Francesco Locatello, Bernhard Sch\u00f6lkopf, and Peter Gehler. 2021. Backward-Compatible Prediction Updates: A Probabilistic Approach. In Proceedings of the Advances in Neural Information Processing Systems. 116\u2013128."},{"key":"e_1_2_1_51_1","volume-title":"Proceedings of the Twelfth International Conference on Learning Representations.","author":"Wan Fanqi","year":"2024","unstructured":"Fanqi Wan, Xinting Huang, Deng Cai, Xiaojun Quan, Wei Bi, and Shuming Shi. 2024. Knowledge Fusion of Large Language Models. In Proceedings of the Twelfth International Conference on Learning Representations."},{"key":"e_1_2_1_52_1","doi-asserted-by":"crossref","unstructured":"Changhan Wang Yun Tang Xutai Ma Anne Wu Sravya Popuri Dmytro Okhonko and Juan Pino. 2020. Fairseq S2T: Fast speech-to-text modeling with fairseq. 
arXiv preprint arXiv:2010.05171.","DOI":"10.18653\/v1\/2020.aacl-demo.6"},{"key":"e_1_2_1_53_1","doi-asserted-by":"crossref","first-page":"2421","DOI":"10.1109\/TSE.2019.2949568","article-title":"Explaining regressions via alignment slicing and mending","volume":"47","author":"Wang Haijun","year":"2019","unstructured":"Haijun Wang, Yun Lin, Zijiang Yang, Jun Sun, Yang Liu, Jinsong Dong, Qinghua Zheng, and Ting Liu. 2019. Explaining regressions via alignment slicing and mending. IEEE Transactions on Software Engineering, 47, 11 (2019), 2421\u20132437.","journal-title":"IEEE Transactions on Software Engineering"},{"key":"e_1_2_1_54_1","unstructured":"Chien-Sheng Wu Andrea Madotto Ehsan Hosseini-Asl Caiming Xiong Richard Socher and Pascale Fung. 2019. Transferable multi-domain state generator for task-oriented dialogue systems. arXiv preprint arXiv:1905.08743."},{"key":"e_1_2_1_55_1","volume-title":"Proceedings of the 35th Conference on Neural Information Processing Systems. 11745\u201311756","author":"Wu Ruihan","unstructured":"Ruihan Wu, Chuan Guo, Awni Y. Hannun, and Laurens van der Maaten. 2021. Fixes That Fail: Self-Defeating Improvements in Machine-Learning Systems. In Proceedings of the 35th Conference on Neural Information Processing Systems. 11745\u201311756."},{"key":"e_1_2_1_56_1","volume-title":"Alolika Gon, and Preethi Raghavan.","author":"Wu Yijing","year":"2023","unstructured":"Yijing Wu, SaiKrishna Rallabandi, Ravisutha Srinivasamurthy, Parag Pravin Dakle, Alolika Gon, and Preethi Raghavan. 2023. HeySQuAD: A Spoken Question Answering Dataset. arXiv preprint arXiv:2304.13689."},{"key":"e_1_2_1_57_1","unstructured":"Yuqing Xie Yi-An Lai Yuanjun Xiong Yi Zhang and Stefano Soatto. 2021. Regression Bugs Are In Your Model! Measuring Reducing and Analyzing Regressions In NLP Model Updates. arXiv preprint arXiv:2105.03048."},{"key":"e_1_2_1_58_1","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 
14299\u201314308","author":"Yan Sijie","year":"2021","unstructured":"Sijie Yan, Yuanjun Xiong, Kaustav Kundu, Shuo Yang, Siqi Deng, Meng Wang, Wei Xia, and Stefano Soatto. 2021. Positive-congruent training: Towards regression-free model updates. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 14299\u201314308."},{"key":"e_1_2_1_59_1","doi-asserted-by":"crossref","first-page":"3337","DOI":"10.3390\/s18103337","article-title":"Second: Sparsely embedded convolutional detection","volume":"18","author":"Yan Yan","year":"2018","unstructured":"Yan Yan, Yuxing Mao, and Bo Li. 2018. Second: Sparsely embedded convolutional detection. Sensors, 18, 10 (2018), 3337.","journal-title":"Sensors"},{"key":"e_1_2_1_60_1","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1002\/stv.430","article-title":"Regression Testing Minimization, Selection and Prioritization: A Survey","volume":"22","author":"Yoo Shin","year":"2012","unstructured":"Shin Yoo and Mark Harman. 2012. Regression Testing Minimization, Selection and Prioritization: A Survey. Software Testing, Verification and Reliability, 22, 2 (2012), 67\u2013120.","journal-title":"Software Testing, Verification and Reliability"},{"key":"e_1_2_1_61_1","volume-title":"Proceedings of the IEEE International Conference on Acoustics, Speech and Signal. 7793\u20137797","author":"You Chenyu","year":"2021","unstructured":"Chenyu You, Nuo Chen, and Yuexian Zou. 2021. Knowledge Distillation for Improved Accuracy in Spoken Question Answering. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal. 7793\u20137797."},{"key":"e_1_2_1_62_1","volume-title":"Proceedings of the IEEE\/ACM 45th International Conference on Software Engineering. 82\u201394","author":"You Hanmo","year":"2023","unstructured":"Hanmo You, Zan Wang, Junjie Chen, Shuang Liu, and Shuochuan Li. 2023. Regression fuzzing for deep learning systems. 
In Proceedings of the IEEE\/ACM 45th International Conference on Software Engineering. 82\u201394."},{"key":"e_1_2_1_63_1","volume-title":"Proceedings of the International Symposium on System and Software Reliability. 137\u2013141","author":"Zhang Jihu","year":"2016","unstructured":"Jihu Zhang, Xiaochuan Jing, Wei Zhang, Haipeng Wang, and Yunwei Dong. 2016. Improve the Quality of ARC Systems Based on the Metamorphic Testing. In Proceedings of the International Symposium on System and Software Reliability. 137\u2013141."},{"key":"e_1_2_1_64_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/TSE.2019.2962027","article-title":"Machine Learning Testing: Survey, Landscapes and Horizons","volume":"48","author":"Zhang Jie M.","year":"2022","unstructured":"Jie M. Zhang, Mark Harman, Lei Ma, and Yang Liu. 2022. Machine Learning Testing: Survey, Landscapes and Horizons. IEEE Transactions on Software Engineering, 48, 1 (2022), 1\u201336.","journal-title":"IEEE Transactions on Software Engineering"},{"key":"e_1_2_1_65_1","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations. 142\u2013149","author":"Zhu Qi","year":"2020","unstructured":"Qi Zhu, Zheng Zhang, Yan Fang, Xiang Li, Ryuichi Takanobu, Jinchao Li, Baolin Peng, Jianfeng Gao, Xiaoyan Zhu, and Minlie Huang. 2020. ConvLab-2: An Open-Source Toolkit for Building, Evaluating, and Diagnosing Dialogue Systems. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations. 
142\u2013149."}],"container-title":["Proceedings of the ACM on Software Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3729358","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T15:27:46Z","timestamp":1750346866000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3729358"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6,19]]},"references-count":65,"journal-issue":{"issue":"FSE","published-print":{"date-parts":[[2025,6,19]]}},"alternative-id":["10.1145\/3729358"],"URL":"https:\/\/doi.org\/10.1145\/3729358","relation":{},"ISSN":["2994-970X"],"issn-type":[{"value":"2994-970X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,6,19]]}}}