{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,21]],"date-time":"2026-02-21T18:31:42Z","timestamp":1771698702248,"version":"3.50.1"},"reference-count":46,"publisher":"Association for Computing Machinery (ACM)","issue":"11","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2025,7]]},"abstract":"<jats:p>Log-based anomaly detection plays a crucial role in ensuring the reliability of systems. While deep learning-based small detection models (SDMs) are efficient, the large language models (LLMs) are accurate and capable of providing explanations. Intuitively, a compelling question arises: Can we seamlessly combine the advantages of both approaches? In this work, we delve into this underexplored research direction and propose CoLA, a novel collaborative log anomaly detection framework. During collaborative inference, an SDM serves as a filter to select potentially anomalous instances, while a downstream LLM acts as an expert to detect anomalies, offer explanations, and refine the SDM. Extensive experiments on three large real-world datasets demonstrate that CoLA significantly outperforms state-of-the-art methods in terms of effectiveness, efficiency, and explainability, while also greatly reducing labor costs.<\/jats:p>","DOI":"10.14778\/3749646.3749668","type":"journal-article","created":{"date-parts":[[2025,9,4]],"date-time":"2025-09-04T17:55:06Z","timestamp":1757008506000},"page":"3979-3987","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["CoLA: Model Collaboration for Log-Based Anomaly Detection"],"prefix":"10.14778","volume":"18","author":[{"given":"Xuhang","family":"Zhu","sequence":"first","affiliation":[{"name":"Zhejiang University, Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security"}]},{"given":"Xiu","family":"Tang","sequence":"additional","affiliation":[{"name":"Zhejiang University, Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security"}]},{"given":"Sai","family":"Wu","sequence":"additional","affiliation":[{"name":"Zhejiang University, Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security"}]},{"given":"Jichen","family":"Li","sequence":"additional","affiliation":[{"name":"Zhejiang University, Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security"}]},{"given":"Haobo","family":"Wang","sequence":"additional","affiliation":[{"name":"Zhejiang University, Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security"}]},{"given":"Chang","family":"Yao","sequence":"additional","affiliation":[{"name":"Zhejiang University, Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security"}]},{"given":"Quanqing","family":"Xu","sequence":"additional","affiliation":[{"name":"OceanBase, Ant Group"}]},{"given":"Gang","family":"Chen","sequence":"additional","affiliation":[{"name":"Zhejiang University, Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security"}]}],"member":"320","published-online":{"date-parts":[[2025,9,4]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.14778\/3626292.3626294"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/3626719"},{"key":"e_1_2_1_3_1","volume-title":"How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition","author":"Dong Guanting","unstructured":"Guanting Dong, Hongyi Yuan, Keming Lu, Chengpeng Li, Mingfeng Xue, Dayiheng Liu, Wei Wang, Zheng Yuan, Chang Zhou, and Jingren Zhou. 2024. How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition. In ACL. Association for Computational Linguistics, 177\u2013198."},{"key":"e_1_2_1_4_1","doi-asserted-by":"crossref","unstructured":"Min Du Feifei Li Guineng Zheng and Vivek Srikumar. 2017. DeepLog: Anomaly Detection and Diagnosis from System Logs through Deep Learning. In CCS. ACM 1285\u20131298.","DOI":"10.1145\/3133956.3134015"},{"key":"e_1_2_1_5_1","unstructured":"Abhimanyu Dubey Abhinav Jauhri Abhinav Pandey Abhishek Kadian and et al. 2024. The Llama 3 Herd of Models. arXiv preprint arXiv:2407.21783 (2024)."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.14778\/3665844.3665857"},{"key":"e_1_2_1_7_1","doi-asserted-by":"crossref","unstructured":"Zhaopeng Gu Bingke Zhu Guibo Zhu Yingying Chen Ming Tang and Jinqiao Wang. 2024. AnomalyGPT: Detecting Industrial Anomalies Using Large Vision-Language Models. In AAAI. 1932\u20131940.","DOI":"10.1609\/aaai.v38i3.27963"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.14778\/3685800.3685803"},{"key":"e_1_2_1_9_1","volume-title":"Lyu","author":"He Pinjia","year":"2017","unstructured":"Pinjia He, Jieming Zhu, Zibin Zheng, and Michael R. Lyu. 2017. Drain: An Online Log Parsing Approach with Fixed Depth Tree. In ICWS. IEEE, 33\u201340."},{"key":"e_1_2_1_10_1","volume-title":"Lyu","author":"He Shilin","year":"2016","unstructured":"Shilin He, Jieming Zhu, Pinjia He, and Michael R. Lyu. 2016. Experience Report: System Log Analysis for Anomaly Detection. In ISSRE. IEEE Computer Society, 207\u2013218."},{"key":"e_1_2_1_11_1","unstructured":"Dan Hendrycks Collin Burns Steven Basart Andy Zou Mantas Mazeika Dawn Song and Jacob Steinhardt. 2021. Measuring Massive Multitask Language Understanding. In ICLR."},{"key":"e_1_2_1_12_1","unstructured":"Edward J. Hu Yelong Shen Phillip Wallis Zeyuan Allen-Zhu Yuanzhi Li Shean Wang Lu Wang and Weizhu Chen. 2022. LoRA: Low-Rank Adaptation of Large Language Models. In ICLR."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/3588918"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/3660768"},{"key":"e_1_2_1_15_1","volume-title":"Log-based Anomaly Detection Without Log Parsing","author":"Le Van-Hoang","unstructured":"Van-Hoang Le and Hongyu Zhang. 2021. Log-based Anomaly Detection Without Log Parsing. In ASE. IEEE, 492\u2013504."},{"key":"e_1_2_1_16_1","doi-asserted-by":"crossref","unstructured":"Van-Hoang Le and Hongyu Zhang. 2022. Log-based Anomaly Detection with Deep Learning: How Far Are We?. In ICSE. ACM 1356\u20131367.","DOI":"10.1145\/3510003.3510155"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.14778\/2367502.2367516"},{"key":"e_1_2_1_18_1","volume-title":"Automated Intelligent Healing in Cloud-Scale Data Centers","author":"Li Rui","unstructured":"Rui Li, Zhinan Cheng, Patrick P. C. Lee, Pinghui Wang, Yi Qiang, Lin Lan, Cheng He, Jinlong Lu, Mian Wang, and Xinquan Ding. 2021. Automated Intelligent Healing in Cloud-Scale Data Centers. In SRDS. IEEE, 244\u2013253."},{"key":"e_1_2_1_19_1","volume-title":"AISTATS","volume":"130","author":"Liu Weiyang","year":"2021","unstructured":"Weiyang Liu, Rongmei Lin, Zhen Liu, Li Xiong, Bernhard Sch\u00f6lkopf, and Adrian Weller. 2021. Learning with Hyperspherical Uniformity. In AISTATS, Vol. 130. PMLR, 1180\u20131188."},{"key":"e_1_2_1_20_1","first-page":"4","volume-title":"Proc. ACM Manag. Data 2","author":"Ma Lei","year":"2024","unstructured":"Lei Ma, Lei Cao, Peter M. VanNostrand, Dennis M. Hofmann, Yao Su, and Elke A. Rundensteiner. 2024. Pluto: Sample Selection for Robust Anomaly Detection on Polluted Log Data. Proc. ACM Manag. Data 2, 4 (2024), 203:1\u2013203:25."},{"key":"e_1_2_1_21_1","doi-asserted-by":"crossref","unstructured":"Weibin Meng Ying Liu Yichen Zhu Shenglin Zhang Dan Pei Yuqing Liu Yihao Chen Ruizhi Zhang Shimin Tao Pei Sun and Rong Zhou. 2019. LogAnomaly: Unsupervised Detection of Sequential and Quantitative Anomalies in Unstructured Logs. In IJCAI. 4739\u20134745.","DOI":"10.24963\/ijcai.2019\/658"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2013.21"},{"key":"e_1_2_1_23_1","volume-title":"Oliner and Jon Stearley","author":"Adam","year":"2007","unstructured":"Adam J. Oliner and Jon Stearley. 2007. What Supercomputers Say: A Study of Five System Logs. In DSN. IEEE Computer Society, 575\u2013584."},{"key":"e_1_2_1_24_1","unstructured":"OpenAI. 2023. GPT-4 Technical Report. arXiv preprint arXiv:2303.08774 (2023)."},{"key":"e_1_2_1_25_1","volume-title":"Manning","author":"Pennington Jeffrey","year":"2014","unstructured":"Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. Glove: Global Vectors for Word Representation. In EMNLP. 1532\u20131543."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2005.10.028"},{"key":"e_1_2_1_27_1","unstructured":"Noam Shazeer Azalia Mirhoseini Krzysztof Maziarz Andy Davis Quoc V. Le Geoffrey E. Hinton and Jeff Dean. 2017. Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer. In ICLR."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.14778\/3632093.3632111"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2022.3152527"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.14778\/3681954.3681973"},{"key":"e_1_2_1_31_1","volume-title":"Hashimoto","author":"Taori Rohan","year":"2023","unstructured":"Rohan Taori, Ishaan Gulrajani, Tianyi Zhang, Yann Dubois, Xuechen Li, Carlos Guestrin, Percy Liang, and Tatsunori B. Hashimoto. 2023. Stanford Alpaca: An Instruction-following LLaMA model. https:\/\/github.com\/tatsu-lab\/stanford_alpaca."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/3588938"},{"key":"e_1_2_1_33_1","doi-asserted-by":"crossref","unstructured":"Junyu Wei Guangyan Zhang Junchao Chen Yang Wang Weimin Zheng Tingtao Sun Jiesheng Wu and Jiangwei Jiang. 2023. LogGrep: Fast and Cheap Cloud Log Storage by Exploiting both Static and Runtime Patterns. In EuroSys. ACM 452\u2013468.","DOI":"10.1145\/3552326.3567484"},{"key":"e_1_2_1_34_1","volume-title":"Mike Li, Simon Kornblith, Rebecca Roelofs, Raphael Gontijo Lopes, Hannaneh Hajishirzi, Ali Farhadi, Hongseok Namkoong, and Ludwig Schmidt.","author":"Wortsman Mitchell","year":"2022","unstructured":"Mitchell Wortsman, Gabriel Ilharco, Jong Wook Kim, Mike Li, Simon Kornblith, Rebecca Roelofs, Raphael Gontijo Lopes, Hannaneh Hajishirzi, Ali Farhadi, Hongseok Namkoong, and Ludwig Schmidt. 2022. Robust fine-tuning of zero-shot models. In CVPR. IEEE, 7949\u20137961."},{"key":"e_1_2_1_35_1","volume-title":"Continual Learning for Large Language Models: A Survey. arXiv preprint arXiv:2402.01364","author":"Wu Tongtong","year":"2024","unstructured":"Tongtong Wu, Linhao Luo, Yuan-Fang Li, Shirui Pan, Thuy-Trang Vu, and Gholamreza Haffari. 2024. Continual Learning for Large Language Models: A Survey. arXiv preprint arXiv:2402.01364 (2024)."},{"key":"e_1_2_1_36_1","volume-title":"Jordan","author":"Xu Wei","year":"2009","unstructured":"Wei Xu, Ling Huang, Armando Fox, David A. Patterson, and Michael I. Jordan. 2009. Detecting large-scale system problems by mining console logs. In SOSP. ACM, 117\u2013132."},{"key":"e_1_2_1_37_1","volume-title":"Semi-supervised Log-based Anomaly Detection via Probabilistic Label Estimation","author":"Yang Lin","unstructured":"Lin Yang, Junjie Chen, Zan Wang, Weijing Wang, Jiajun Jiang, Xuyuan Dong, and Wenbin Zhang. 2021. Semi-supervised Log-based Anomaly Detection via Probabilistic Label Estimation. In ICSE. IEEE, 1448\u20131460."},{"key":"e_1_2_1_38_1","volume-title":"ECCV","volume":"15139","author":"Yang Yuchen","year":"2024","unstructured":"Yuchen Yang, Kwonjoon Lee, Behzad Dariush, Yinzhi Cao, and Shao-Yuan Lo. 2024. Follow the Rules: Reasoning for Video Anomaly Detection with Large Language Models. In ECCV, Vol. 15139. Springer, 304\u2013322."},{"key":"e_1_2_1_39_1","doi-asserted-by":"crossref","unstructured":"Xu Zhang Yong Xu Qingwei Lin Bo Qiao Hongyu Zhang Yingnong Dang Chunyu Xie Xinsheng Yang Qian Cheng Ze Li Junjie Chen Xiaoting He Randolph Yao Jian-Guang Lou Murali Chintalapati Furao Shen and Dongmei Zhang. 2019. Robust log-based anomaly detection on unstable log data. In FSE. ACM 807\u2013817.","DOI":"10.1145\/3338906.3338931"},{"key":"e_1_2_1_40_1","volume-title":"ESTELLE: An Efficient and Cost-effective Cloud Log Engine. In SIGMOD. ACM, 201\u2013213.","author":"Zhang Yupu","year":"2024","unstructured":"Yupu Zhang, Guanglin Cong, Jihan Qu, Ran Xu, Yuan Fu, Weiqi Li, Feiran Hu, Jing Liu, Wenliang Zhang, and Kai Zheng. 2024. ESTELLE: An Efficient and Cost-effective Cloud Log Engine. In SIGMOD. ACM, 201\u2013213."},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.14778\/3636218.3636225"},{"key":"e_1_2_1_42_1","volume-title":"Patel","author":"Zhang Yunjia","year":"2023","unstructured":"Yunjia Zhang, Avrilia Floratou, Joyce Cahoon, Subru Krishnan, Andreas C. M\u00fcller, Dalitso Banda, Fotis Psallidas, and Jignesh M. Patel. 2023. Schema Matching using Pre-Trained Language Models. In ICDE. IEEE, 1558\u20131571."},{"key":"e_1_2_1_43_1","unstructured":"Wayne Xin Zhao Kun Zhou Junyi Li Tianyi Tang and et al. 2023. A Survey of Large Language Models. arXiv preprint arXiv:2303.18223 (2023)."},{"key":"e_1_2_1_44_1","doi-asserted-by":"crossref","unstructured":"Junhao Zheng Shengjie Qiu Chengming Shi and Qianli Ma. 2025. Towards Lifelong Learning of Large Language Models: A Survey. ACM Comput. Surv. (2025).","DOI":"10.1145\/3716629"},{"key":"e_1_2_1_45_1","volume-title":"LIMA: Less Is More for Alignment. In NeurIPS.","author":"Zhou Chunting","year":"2023","unstructured":"Chunting Zhou, Pengfei Liu, Puxin Xu, Srinivasan Iyer, Jiao Sun, Yuning Mao, and et al. 2023. LIMA: Less Is More for Alignment. In NeurIPS."},{"key":"e_1_2_1_46_1","unstructured":"Tian Zhou Peisong Niu Xue Wang Liang Sun and Rong Jin. 2023. One Fits All: Power General Time Series Analysis by Pretrained LM. In NeurIPS."}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3749646.3749668","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,5]],"date-time":"2025-09-05T02:57:42Z","timestamp":1757041062000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3749646.3749668"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7]]},"references-count":46,"journal-issue":{"issue":"11","published-print":{"date-parts":[[2025,7]]}},"alternative-id":["10.14778\/3749646.3749668"],"URL":"https:\/\/doi.org\/10.14778\/3749646.3749668","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2025,7]]},"assertion":[{"value":"2025-09-04","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}