{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T21:54:21Z","timestamp":1740174861270,"version":"3.37.3"},"reference-count":17,"publisher":"Wiley","license":[{"start":{"date-parts":[[2021,3,9]],"date-time":"2021-03-09T00:00:00Z","timestamp":1615248000000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62072342","61672392"],"award-info":[{"award-number":["62072342","61672392"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Scientific Programming"],"published-print":{"date-parts":[[2021,3,9]]},"abstract":"<jats:p>The log analysis-based system fault diagnosis method can help engineers analyze the fault events generated by the system. The K-means algorithm can perform log analysis well and does not require a lot of prior knowledge, but the K-means-based system fault diagnosis method needs to be improved in both efficiency and accuracy. To solve this problem, we propose a system fault diagnosis method based on a reclustering algorithm. First, we propose a log vectorization method based on the PV-DM language model to obtain low-dimensional log vectors which can provide effective data support for the subsequent fault diagnosis; then, we improve the K-means algorithm and make the effect of K-means algorithm based log clustering; finally, we propose a reclustering method based on keywords\u2019 extraction to improve the accuracy of fault diagnosis. We use system log data generated by two supercomputers to verify our method. 
The experimental results show that compared with the traditional K-means method, our method can improve the accuracy of fault diagnosis while ensuring the efficiency of fault diagnosis.<\/jats:p>","DOI":"10.1155\/2021\/6617882","type":"journal-article","created":{"date-parts":[[2021,3,9]],"date-time":"2021-03-09T18:50:09Z","timestamp":1615315809000},"page":"1-8","source":"Crossref","is-referenced-by-count":4,"title":["A System Fault Diagnosis Method with a Reclustering Algorithm"],"prefix":"10.1155","volume":"2021","author":[{"given":"Zhe","family":"Yang","sequence":"first","affiliation":[{"name":"School of Computer Science, Wuhan University, Wuhan, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0471-0021","authenticated-orcid":true,"given":"Shi","family":"Ying","sequence":"additional","affiliation":[{"name":"School of Computer Science, Wuhan University, Wuhan, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8723-0970","authenticated-orcid":true,"given":"Bingming","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Computer Science, Wuhan University, Wuhan, China"}]},{"given":"Yiyao","family":"Li","sequence":"additional","affiliation":[{"name":"School of Software Engineering, Tongji University, Shanghai, China"}]},{"given":"Bo","family":"Dong","sequence":"additional","affiliation":[{"name":"School of Computer Science, Wuhan University, Wuhan, China"}]},{"given":"Jiangyi","family":"Geng","sequence":"additional","affiliation":[{"name":"School of Computer Science, Wuhan University, Wuhan, China"}]},{"given":"Ting","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Computer Science, Wuhan University, Wuhan, China"}]}],"member":"311","reference":[{"first-page":"476","article-title":"Filtering failure logs for a bluegene\/L prototype","author":"Y. Liang","key":"1"},{"first-page":"60","article-title":"An overview of the BlueGene\/L supercomputer","author":"N. R. 
Adiga","key":"2"},{"first-page":"102","article-title":"Log clustering based problem identification for online service systems","author":"Q. Lin","key":"3"},{"first-page":"149","article-title":"Execution anomaly detection in distributed systems through unstructured log analysis","author":"Q. Fu","key":"4"},{"first-page":"231","article-title":"Mining invariants from console logs for system problem detection","author":"J. G. Lou","key":"5"},{"first-page":"117","article-title":"Detecting large-scale system problems by mining console logs","author":"W. Xu","key":"6"},{"first-page":"402","article-title":"Assisting developers of big data analytics applications when deploying on hadoop clouds","author":"W. Shang","key":"7"},{"first-page":"36","article-title":"Failure diagnosis using decision trees","author":"M. Chen","key":"8"},{"first-page":"377","article-title":"Mining unstructured log files for recurrent fault diagnosis","author":"T. Reidemeister","key":"9"},{"first-page":"375","article-title":"Automated known problem diagnosis with event traces","author":"C. Yuan","key":"10"},{"first-page":"207","article-title":"Experience report: system log analysis for anomaly detection","author":"S. He","key":"11"},{"first-page":"785","article-title":"LogSig: generating system events from raw textual logs","author":"L. Tang","key":"12"},{"first-page":"160","article-title":"A unified architecture for natural language processing: deep neural networks with multitask learning","author":"R. Collobert","key":"13"},{"article-title":"Efficient estimation of word representations in vector space","year":"2013","author":"T. Mikolov","key":"14"},{"volume-title":"Data Mining: Concept and Technology","year":"2012","author":"J. Han","key":"15"},{"first-page":"575","article-title":"What supercomputers say: a study of five system logs","author":"A. Oliner","key":"16"},{"first-page":"583","article-title":"Failure prediction in IBM Bluegene\/L event logs","author":"Y. 
Liang","key":"17"}],"container-title":["Scientific Programming"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/downloads.hindawi.com\/journals\/sp\/2021\/6617882.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/downloads.hindawi.com\/journals\/sp\/2021\/6617882.xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/downloads.hindawi.com\/journals\/sp\/2021\/6617882.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,3,9]],"date-time":"2021-03-09T18:50:10Z","timestamp":1615315810000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.hindawi.com\/journals\/sp\/2021\/6617882\/"}},"subtitle":[],"editor":[{"given":"Pengwei","family":"Wang","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,3,9]]},"references-count":17,"alternative-id":["6617882","6617882"],"URL":"https:\/\/doi.org\/10.1155\/2021\/6617882","relation":{},"ISSN":["1875-919X","1058-9244"],"issn-type":[{"type":"electronic","value":"1875-919X"},{"type":"print","value":"1058-9244"}],"subject":[],"published":{"date-parts":[[2021,3,9]]}}}