{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,6]],"date-time":"2026-03-06T08:52:24Z","timestamp":1772787144164,"version":"3.50.1"},"reference-count":30,"publisher":"Wiley","issue":"1","license":[{"start":{"date-parts":[[2021,7,28]],"date-time":"2021-07-28T00:00:00Z","timestamp":1627430400000},"content-version":"vor","delay-in-days":208,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["51967010"],"award-info":[{"award-number":["51967010"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Complexity"],"published-print":{"date-parts":[[2021,1]]},"abstract":"<jats:p>The Lanzhou\u2010Xinjiang (Lan\u2010Xin) high\u2010speed railway is one of the principal sections of the railway network in western China, and signal equipment is of great importance in ensuring the safe and efficient operation of the high\u2010speed railway. Over a long period, in the railway operation and maintenance process, the railway signaling and communications department has recorded a large amount of unstructured text information about equipment faults in the form of natural language. However, due to irregularities in the recording methods of these data, it is difficult to use directly. In this paper, a method based on natural language processing (NLP) was adopted to analyze and classify this information. First, the Latent Dirichlet Allocation (LDA) topic model was used to extract the semantic features of the text, which were then expressed in the corresponding topic feature space. 
Next, the Support Vector Machine (SVM) algorithm was used to construct a signal equipment fault diagnostic model that reduced the impact of sample data imbalance on the classification accuracy. This was compared and analyzed with the traditional Naive Bayes (NB), Logistic Regression (LR), Random Forest (RF), and K\u2010Nearest Neighbor (KNN) algorithms. This study used signal equipment failure text data from the Lan\u2010Xin high\u2010speed railway to conduct experimental analysis and verify the effectiveness of the proposed method. Experiments showed that the accuracy of the SVM classification algorithm could reach 0.84 after being combined with the LDA topic model, which verifies that the natural language processing method can effectively realize the fault diagnosis of signal equipment and has certain guiding significance for the maintenance of field signal equipment.<\/jats:p>","DOI":"10.1155\/2021\/9126745","type":"journal-article","created":{"date-parts":[[2021,7,28]],"date-time":"2021-07-28T18:20:09Z","timestamp":1627496409000},"update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":19,"title":["Fault Diagnosis of Signal Equipment on the Lanzhou\u2010Xinjiang High\u2010Speed Railway Using Machine Learning for Natural Language 
Processing"],"prefix":"10.1155","volume":"2021","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0461-5021","authenticated-orcid":false,"given":"Lei","family":"Shi","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9342-0455","authenticated-orcid":false,"given":"Yulin","family":"Zhu","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0485-2775","authenticated-orcid":false,"given":"Youpeng","family":"Zhang","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9883-1691","authenticated-orcid":false,"given":"Zhongji","family":"Su","sequence":"additional","affiliation":[]}],"member":"311","published-online":{"date-parts":[[2021,7,28]]},"reference":[{"key":"e_1_2_10_1_2","first-page":"53","article-title":"Fault diagnosis system for vehicle on-board equipment of high-speed railway","volume":"37","author":"Zhao Y.","year":"2015","journal-title":"Journal of the China Railway Society"},{"key":"e_1_2_10_2_2","first-page":"59","article-title":"Intelligent classification of faults of railway signal equipment based on imbalanced text data mining","volume":"40","author":"Yang L. B.","year":"2018","journal-title":"Journal of the China Railway Society"},{"key":"e_1_2_10_3_2","first-page":"80","article-title":"Research on fault feature extraction and diagnosis of railway switches based on PLSA and SVM","volume":"40","author":"Zhong Z. W.","year":"2018","journal-title":"Journal of the China Railway Society"},{"key":"e_1_2_10_4_2","first-page":"56","article-title":"Research of fault feature extraction and diagnosis method for CTCS on-board equipment (OBE) based on labeled-LDA","volume":"41","author":"ShangGuan W.","year":"2018","journal-title":"Journal of the China Railway Society"},{"key":"e_1_2_10_5_2","doi-asserted-by":"crossref","unstructured":"Ren Y. and Pan J. Z. 
Optimising ontology stream reasoning with truth maintenance system Proceedings of the 20th ACM CIKM 2011 October 2011 Glasgow UK 831\u2013836.","DOI":"10.1145\/2063576.2063696"},{"key":"e_1_2_10_6_2","doi-asserted-by":"publisher","DOI":"10.3390\/app9235129"},{"key":"e_1_2_10_7_2","first-page":"931","article-title":"Evolution properties of complex networks in terms of the LDA","volume":"48","author":"Zhao Z. J.","year":"2019","journal-title":"Journal of University of Electronic Science and Technology of China"},{"key":"e_1_2_10_8_2","first-page":"676","article-title":"Extracting product aspects and user opinions based on semantic constrained LDA model","volume":"28","author":"Peng Y.","year":"2016","journal-title":"Journal of Software"},{"key":"e_1_2_10_9_2","doi-asserted-by":"publisher","DOI":"10.1109\/access.2020.2997973"},{"key":"e_1_2_10_10_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11042-020-09549-3"},{"key":"e_1_2_10_11_2","doi-asserted-by":"publisher","DOI":"10.1142\/s021819401840017x"},{"key":"e_1_2_10_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/access.2019.2897475"},{"key":"e_1_2_10_13_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2020.06.099"},{"key":"e_1_2_10_14_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.trc.2020.102627"},{"key":"e_1_2_10_15_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.techfore.2020.120041"},{"key":"e_1_2_10_16_2","doi-asserted-by":"publisher","DOI":"10.3390\/s19173728"},{"key":"e_1_2_10_17_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2018.02.012"},{"key":"e_1_2_10_18_2","first-page":"41","article-title":"Review of Chinese automatic word segmentation","volume":"55","author":"Feng G. 
H.","year":"2011","journal-title":"Library and Information Service"},{"key":"e_1_2_10_19_2","first-page":"175","article-title":"A view of Chinese word automatic segmentation research in the Chinese information disposal","volume":"3","author":"Liu Q.","year":"2006","journal-title":"Computer Engineering and Applications"},{"key":"e_1_2_10_20_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.procs.2018.04.017"},{"key":"e_1_2_10_21_2","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0241701"},{"key":"e_1_2_10_22_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2018.02.034"},{"key":"e_1_2_10_23_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2018.05.006"},{"key":"e_1_2_10_24_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2019.03.001"},{"key":"e_1_2_10_25_2","first-page":"298","article-title":"Automatic text summarization research based on topic model and information entropy","volume":"41","author":"Li R.","year":"2014","journal-title":"Computer Science"},{"key":"e_1_2_10_26_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2018.07.090"},{"key":"e_1_2_10_27_2","doi-asserted-by":"publisher","DOI":"10.3390\/buildings10010002"},{"key":"e_1_2_10_28_2","doi-asserted-by":"publisher","DOI":"10.3390\/app9204402"},{"key":"e_1_2_10_29_2","doi-asserted-by":"publisher","DOI":"10.3390\/jmmp3010011"},{"key":"e_1_2_10_30_2","doi-asserted-by":"publisher","DOI":"10.1162\/089976601300014493"}],"container-title":["Complexity"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/downloads.hindawi.com\/journals\/complexity\/2021\/9126745.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/downloads.hindawi.com\/journals\/complexity\/2021\/9126745.xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1155\/2021\/9126745","content-type":"unspecified","content-version":"vor","i
ntended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,9]],"date-time":"2024-08-09T22:56:10Z","timestamp":1723244170000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1155\/2021\/9126745"}},"subtitle":[],"editor":[{"given":"Muhammad","family":"Javaid","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,1]]},"references-count":30,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,1]]}},"alternative-id":["10.1155\/2021\/9126745"],"URL":"https:\/\/doi.org\/10.1155\/2021\/9126745","archive":["Portico"],"relation":{},"ISSN":["1076-2787","1099-0526"],"issn-type":[{"value":"1076-2787","type":"print"},{"value":"1099-0526","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,1]]},"assertion":[{"value":"2021-06-29","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-07-20","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-07-28","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}],"article-number":"9126745"}}