{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T02:06:40Z","timestamp":1740103600503,"version":"3.37.3"},"reference-count":37,"publisher":"Wiley","license":[{"start":{"date-parts":[[2020,11,14]],"date-time":"2020-11-14T00:00:00Z","timestamp":1605312000000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Tianjin Key Research and Development Plan","award":["20YFZCGX00680","19ZXZNGX00090","KFZD-SW-440","19JCYBJC15500","PESA2019074","PESA2019073","PESA2018079","U1533104","61601467","61872202"],"award-info":[{"award-number":["20YFZCGX00680","19ZXZNGX00090","KFZD-SW-440","19JCYBJC15500","PESA2019074","PESA2019073","PESA2018079","U1533104","61601467","61872202"]}]},{"name":"2019 Tianjin New Generation AI Technology Key Project","award":["20YFZCGX00680","19ZXZNGX00090","KFZD-SW-440","19JCYBJC15500","PESA2019074","PESA2019073","PESA2018079","U1533104","61601467","61872202"],"award-info":[{"award-number":["20YFZCGX00680","19ZXZNGX00090","KFZD-SW-440","19JCYBJC15500","PESA2019074","PESA2019073","PESA2018079","U1533104","61601467","61872202"]}]},{"DOI":"10.13039\/501100002367","name":"Chinese Academy of Sciences","doi-asserted-by":"publisher","award":["20YFZCGX00680","19ZXZNGX00090","KFZD-SW-440","19JCYBJC15500","PESA2019074","PESA2019073","PESA2018079","U1533104","61601467","61872202"],"award-info":[{"award-number":["20YFZCGX00680","19ZXZNGX00090","KFZD-SW-440","19JCYBJC15500","PESA2019074","PESA2019073","PESA2018079","U1533104","61601467","61872202"]}],"id":[{"id":"10.13039\/501100002367","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100006606","name":"Natural Science Foundation of Tianjin 
City","doi-asserted-by":"publisher","award":["20YFZCGX00680","19ZXZNGX00090","KFZD-SW-440","19JCYBJC15500","PESA2019074","PESA2019073","PESA2018079","U1533104","61601467","61872202"],"award-info":[{"award-number":["20YFZCGX00680","19ZXZNGX00090","KFZD-SW-440","19JCYBJC15500","PESA2019074","PESA2019073","PESA2018079","U1533104","61601467","61872202"]}],"id":[{"id":"10.13039\/501100006606","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Civil Aviation Safety Capacity Building Foundation of China","award":["20YFZCGX00680","19ZXZNGX00090","KFZD-SW-440","19JCYBJC15500","PESA2019074","PESA2019073","PESA2018079","U1533104","61601467","61872202"],"award-info":[{"award-number":["20YFZCGX00680","19ZXZNGX00090","KFZD-SW-440","19JCYBJC15500","PESA2019074","PESA2019073","PESA2018079","U1533104","61601467","61872202"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["20YFZCGX00680","19ZXZNGX00090","KFZD-SW-440","19JCYBJC15500","PESA2019074","PESA2019073","PESA2018079","U1533104","61601467","61872202"],"award-info":[{"award-number":["20YFZCGX00680","19ZXZNGX00090","KFZD-SW-440","19JCYBJC15500","PESA2019074","PESA2019073","PESA2018079","U1533104","61601467","61872202"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Wireless Communications and Mobile Computing"],"published-print":{"date-parts":[[2020,11,14]]},"abstract":"<jats:p>System logs can record the system status and important events during system operation in detail. Detecting anomalies in the system logs is a common method for modern large-scale distributed systems. Yet threshold-based classification models used for anomaly detection output only two values: normal or abnormal, which lacks probability of estimating whether the prediction results are correct. 
In this paper, the Venn-Abers predictor, a statistical learning algorithm, is adopted to evaluate the confidence of prediction results in the field of system log anomaly detection. It calculates the probability distribution of labels for a set of samples and provides, to some extent, a quality assessment of the predicted labels. Two Venn-Abers predictors, LR-VA and SVM-VA, are implemented based on Logistic Regression and Support Vector Machine, respectively. The differences among the algorithms are then considered so as to build a multimodel fusion algorithm by Stacking, and a Venn-Abers predictor based on the Stacking algorithm, called Stacking-VA, is implemented. The performance of four types of algorithms (unimodel, Venn-Abers predictor based on unimodel, multimodel, and Venn-Abers predictor based on multimodel) is compared in terms of validity and accuracy. Experiments are carried out on a log dataset of the Hadoop Distributed File System (HDFS). In the comparative experiments on unimodels, the results show that the validities of LR-VA and SVM-VA are better than those of the two corresponding underlying models. Compared with the underlying model, the accuracy of the SVM-VA predictor is better than that of the LR-VA predictor, and, more significantly, the recall rate increases from 81% to 94%. In the experiments on multiple models, the algorithm based on Stacking multimodel fusion is significantly superior to the underlying classifier. The average accuracy of Stacking-VA is larger than 0.95, and its prediction results are more stable than those of LR-VA and SVM-VA. 
Experimental results show that the Venn-Abers predictor is a flexible tool that can make accurate and valid probability predictions in the field of system log anomaly detection.<\/jats:p>","DOI":"10.1155\/2020\/8827185","type":"journal-article","created":{"date-parts":[[2020,11,16]],"date-time":"2020-11-16T08:59:51Z","timestamp":1605517191000},"page":"1-12","source":"Crossref","is-referenced-by-count":3,"title":["Valid Probabilistic Anomaly Detection Models for System Logs"],"prefix":"10.1155","volume":"2020","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8179-441X","authenticated-orcid":true,"given":"Chunbo","family":"Liu","sequence":"first","affiliation":[{"name":"Information Security Evaluation Center, Civil Aviation University of China, Tianjin 300300, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2051-0799","authenticated-orcid":true,"given":"Lanlan","family":"Pan","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China"}]},{"given":"Zhaojun","family":"Gu","sequence":"additional","affiliation":[{"name":"Information Security Evaluation Center, Civil Aviation University of China, Tianjin 300300, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2776-5703","authenticated-orcid":true,"given":"Jialiang","family":"Wang","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1827-0028","authenticated-orcid":true,"given":"Yitong","family":"Ren","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3252-9254","authenticated-orcid":true,"given":"Zhi","family":"Wang","sequence":"additional","affiliation":[{"name":"College of Cyber Science, Nankai University, Tianjin 300350, 
China"}]}],"member":"311","reference":[{"key":"1","doi-asserted-by":"publisher","DOI":"10.3390\/s150202774"},{"first-page":"59","article-title":"Histogram-based outlier score (HBOS) a fast unsupervised anomaly detection algorithm","author":"M. Goldstein","key":"2"},{"key":"3","doi-asserted-by":"crossref","DOI":"10.31979\/etd.znsb-bw4d","volume-title":"Anomaly Detection for Application Log Data","author":"A. Grover","year":"2018"},{"key":"4","doi-asserted-by":"publisher","DOI":"10.1145\/335191.335437"},{"key":"5","doi-asserted-by":"publisher","DOI":"10.1145\/1541880.1541882"},{"volume-title":"Self-Organizing Maps","year":"2012","author":"T. Kohonen","key":"6"},{"key":"7","doi-asserted-by":"publisher","DOI":"10.1016\/j.engappai.2009.09.015"},{"author":"I. Goodfellow","key":"8","article-title":"NIPS 2016 Tutorial: Generative adversarial networks"},{"key":"9","doi-asserted-by":"publisher","DOI":"10.1109\/icse-seip.2019.00021"},{"key":"10","doi-asserted-by":"publisher","DOI":"10.1145\/1629575.1629587"},{"first-page":"1","article-title":"Mining invariants from console logs for system problem detection","author":"J. Lou","key":"11"},{"key":"12","doi-asserted-by":"publisher","DOI":"10.1109\/issre.2016.21"},{"key":"13","doi-asserted-by":"publisher","DOI":"10.1145\/3133956.3134015"},{"issue":"11","key":"14","first-page":"3204","article-title":"Time series anomaly detection method based on frequent pattern discovery","volume":"38","author":"H. Li","year":"2017","journal-title":"Journal of Computer Applications"},{"key":"15","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-34637-9_5"},{"key":"16","doi-asserted-by":"publisher","DOI":"10.1007\/s10796-020-10026-3"},{"key":"17","first-page":"625","article-title":"Transcend: detecting concept drift in malware classification models","volume-title":"USENIX Security Symposium","author":"R. Jordaney","year":"2017"},{"first-page":"721","article-title":"Fortifying botnet classification based on Venn-Abers prediction","author":"W. 
Zhi","key":"18"},{"key":"19","doi-asserted-by":"publisher","DOI":"10.3390\/electronics9020232"},{"key":"20","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2012.07.034"},{"key":"21","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-33412-2_19"},{"key":"22","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-23960-1_56"},{"key":"23","doi-asserted-by":"publisher","DOI":"10.1007\/s10472-013-9367-5"},{"key":"24","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2014.11.046"},{"key":"25","first-page":"15","article-title":"Inductive Venn-Abers predictive distribution","volume-title":"Conformal and Probabilistic Prediction and Applications","author":"I. Nouretdinov","year":"2018"},{"volume-title":"Venn-Abers Predictors","year":"2012","author":"V. Vovk","key":"26"},{"key":"27","doi-asserted-by":"publisher","DOI":"10.1109\/dsc50466.2020.00063"},{"key":"28","doi-asserted-by":"publisher","DOI":"10.1007\/BF00058655"},{"key":"29","doi-asserted-by":"publisher","DOI":"10.1016\/0306-4573(88)90021-0"},{"key":"30","doi-asserted-by":"publisher","DOI":"10.1016\/S0893-6080(05)80023-1"},{"issue":"2","key":"31","first-page":"197","article-title":"Detection method of Android malware based on multi-feature and Stacking algorithm","volume":"27","author":"J. Sheng","year":"2018","journal-title":"Computer Systems & Applications"},{"key":"32","doi-asserted-by":"publisher","DOI":"10.1023\/A:1010933404324"},{"first-page":"28","article-title":"Calibrating probability estimation trees using Venn-Abers predictors","author":"U. Johansson","key":"33"},{"key":"34","doi-asserted-by":"publisher","DOI":"10.1214\/aoms\/1177728423"},{"key":"35","first-page":"892","article-title":"Large-scale probabilistic predictors with and without guarantees of validity","volume-title":"Advances in Neural Information Processing Systems","author":"V. Vovk","year":"2015"},{"key":"36","article-title":"The hadoop distributed file system: architecture and design","volume":"11","author":"D. 
Borthakur","year":"2007","journal-title":"Hadoop Project Website"},{"first-page":"1","article-title":"Calibrated multi-probabilistic prediction as a defense against adversarial attacks","author":"J. Peck","key":"37"}],"container-title":["Wireless Communications and Mobile Computing"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/downloads.hindawi.com\/journals\/wcmc\/2020\/8827185.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/downloads.hindawi.com\/journals\/wcmc\/2020\/8827185.xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/downloads.hindawi.com\/journals\/wcmc\/2020\/8827185.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2020,11,16]],"date-time":"2020-11-16T09:00:09Z","timestamp":1605517209000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.hindawi.com\/journals\/wcmc\/2020\/8827185\/"}},"subtitle":[],"editor":[{"given":"Weizhi","family":"Meng","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2020,11,14]]},"references-count":37,"alternative-id":["8827185","8827185"],"URL":"https:\/\/doi.org\/10.1155\/2020\/8827185","relation":{},"ISSN":["1530-8677","1530-8669"],"issn-type":[{"type":"electronic","value":"1530-8677"},{"type":"print","value":"1530-8669"}],"subject":[],"published":{"date-parts":[[2020,11,14]]}}}