{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,10,18]],"date-time":"2024-10-18T04:29:22Z","timestamp":1729225762148,"version":"3.27.0"},"reference-count":0,"publisher":"IOS Press","isbn-type":[{"value":"9781643685489","type":"electronic"}],"license":[{"start":{"date-parts":[[2024,10,16]],"date-time":"2024-10-16T00:00:00Z","timestamp":1729036800000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,10,16]]},"abstract":"<jats:p>Existing Chinese spelling check (CSC) methods require the input and output to be the same length, which limits their ability to correct variable-length errors. They mainly focus on modelling Chinese characters\u2019 phonetic information and generating candidates for each position. In contrast, few approaches delve into the intricacies of splitting Chinese characters to address glyph errors and variable-length splitting corrections. To address this issue, we define the Chinese Splitting Error Correction (CSEC) task and develop CSEC datasets in the news and social media domains. We then propose the Soft-Masked Multi-feature Error Correction (SoMu) model, which first generates semantic, phonetic, graphic, and unique Chinese Wubi embeddings, then integrates those features through selective gating fusion, applies a soft-mask strategy to filter incorrect tokens, and finally uses transformer layers to predict the correct ones. This model effectively addresses both spelling and splitting errors. Extensive analysis shows that our model significantly improves character-splitting information modelling for CSEC. 
Our dataset is available at https:\/\/github.com\/Skywalker-Harrison\/SoMu.<\/jats:p>","DOI":"10.3233\/faia240952","type":"book-chapter","created":{"date-parts":[[2024,10,17]],"date-time":"2024-10-17T13:46:23Z","timestamp":1729172783000},"source":"Crossref","is-referenced-by-count":0,"title":["A Hybrid Approach towards Chinese Spelling and Splitting Error Correction"],"prefix":"10.3233","author":[{"given":"Junhong","family":"Liang","sequence":"first","affiliation":[{"name":"State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, CAS, Beijing, China"},{"name":"School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China"}]},{"given":"Junnan","family":"Zhu","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, CAS, Beijing, China"},{"name":"School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China"}]},{"given":"Feifei","family":"Zhai","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, CAS, Beijing, China"},{"name":"Fanyu AI Laboratory, Zhongke Fanyu Technology Co., Ltd, Beijing, China"}]},{"given":"Nanchang","family":"Cheng","sequence":"additional","affiliation":[{"name":"National Broadcast Media Language Resources Monitoring and Research Center, Communication University of China, Beijing, China, liangjunhong2022@ia.ac.cn"}]},{"given":"Chengqing","family":"Zong","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, CAS, Beijing, China"},{"name":"School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China"}]},{"given":"Yu","family":"Zhou","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of 
Automation, CAS, Beijing, China"},{"name":"Fanyu AI Laboratory, Zhongke Fanyu Technology Co., Ltd, Beijing, China"}]}],"member":"7437","container-title":["Frontiers in Artificial Intelligence and Applications","ECAI 2024"],"original-title":[],"link":[{"URL":"https:\/\/ebooks.iospress.nl\/pdf\/doi\/10.3233\/FAIA240952","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,17]],"date-time":"2024-10-17T13:46:23Z","timestamp":1729172783000},"score":1,"resource":{"primary":{"URL":"https:\/\/ebooks.iospress.nl\/doi\/10.3233\/FAIA240952"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,10,16]]},"ISBN":["9781643685489"],"references-count":0,"URL":"https:\/\/doi.org\/10.3233\/faia240952","relation":{},"ISSN":["0922-6389","1879-8314"],"issn-type":[{"value":"0922-6389","type":"print"},{"value":"1879-8314","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,10,16]]}}}