{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,19]],"date-time":"2026-02-19T17:08:22Z","timestamp":1771520902023,"version":"3.50.1"},"reference-count":22,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2005,6,1]],"date-time":"2005-06-01T00:00:00Z","timestamp":1117584000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["SIGKDD Explor. Newsl."],"published-print":{"date-parts":[[2005,6]]},"abstract":"<jats:p>This paper presents a lexicalized HMM-based approach to Chinese named entity recognition (NER). To tackle the problem of unknown words, we unify unknown word identification and NER as a single tagging task on a sequence of known words. To do this, we first employ a known-word bigram-based model to segment a sentence into a sequence of known words, and then apply the uniformly lexicalized HMMs to assign each known word a proper hybrid tag that indicates its pattern in forming an entity and the category of the formed entity. Our system is able to integrate both the internal formation patterns and the surrounding contextual clues for NER under the framework of HMMs. As a result, the performance of the system can be improved without losing its efficiency in training and tagging. We have tested our system using different public corpora. The results show that lexicalized HMMs can substantially improve NER performance over standard HMMs. The results also indicate that character-based tagging (viz. 
the tagging based on pure single-character words) is comparable to and can even outperform the relevant known-word based tagging when a lexicalization technique is applied.<\/jats:p>","DOI":"10.1145\/1089815.1089819","type":"journal-article","created":{"date-parts":[[2007,1,17]],"date-time":"2007-01-17T18:32:02Z","timestamp":1169058722000},"page":"19-25","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":49,"title":["Chinese named entity recognition using lexicalized HMMs"],"prefix":"10.1145","volume":"7","author":[{"given":"Guohong","family":"Fu","sequence":"first","affiliation":[{"name":"The University of Hong Kong, Hong Kong"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kang-Kwong","family":"Luke","sequence":"additional","affiliation":[{"name":"The University of Hong Kong, Hong Kong"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2005,6]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1007558221122"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.3115\/1073083.1073163"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/1014052.1014065"},{"key":"e_1_2_1_5_1","first-page":"4","article-title":"Transformation-based error-driven learning and natural language processing: A case study in part-of-speech tagging","volume":"21","author":"Brill E","year":"1995","unstructured":"Brill , E . Transformation-based error-driven learning and natural language processing: A case study in part-of-speech tagging . Computational Linguistics , 21 , 4 ( 1995 ), 543--565. Brill, E. Transformation-based error-driven learning and natural language processing: A case study in part-of-speech tagging. 
Computational Linguistics, 21, 4 (1995), 543--565.","journal-title":"Computational Linguistics"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.3115\/1072228.1072282"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.3115\/1073083.1073167"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1017\/S1351324904003353"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.3115\/990820.990890"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.5555\/944790.944819"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.3115\/1072228.1072240"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.3115\/1119384.1119393"},{"key":"e_1_2_1_13_1","first-page":"2","article-title":"Chinese named entity recognition using role model","volume":"8","author":"Zhang H.-P.","year":"2003","unstructured":"Zhang, H.-P., Liu, Q., Yu, H.-K., Cheng, Y.-Q., and Bai, S. Chinese named entity recognition using role model. Computational Linguistics and Chinese Language Processing, 8, 2 (2003), 29--60.","journal-title":"Computational Linguistics and Chinese Language Processing"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-30211-7_52"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-30211-7_10"},{"key":"e_1_2_1_16_1","first-page":"2","article-title":"Specification for corpus processing at Peking University: Word segmentation, POS tagging and phonetic notation","volume":"13","author":"Yu S.","year":"2003","unstructured":"Yu, S., Duan, H., Zhu, S., Swen, B., and Chang, B. Specification for corpus processing at Peking University: Word segmentation, POS tagging and phonetic notation. Journal of Chinese Language and Computing, 13, 2 (2003), 121--158.","journal-title":"Journal of Chinese Language and Computing"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-30211-7_74"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.3115\/1119176.1119204"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.3115\/1119355.1119381"},{"key":"e_1_2_1_20_1","first-page":"1","article-title":"Chinese word segmentation as character tagging","volume":"8","author":"Xue N","year":"2003","unstructured":"Xue, N. Chinese word segmentation as character tagging. Computational Linguistics and Chinese Language Processing, 8, 1 (2003), 29--48.","journal-title":"Computational Linguistics and Chinese Language Processing"},{"key":"e_1_2_1_21_1","first-page":"2","article-title":"CDWS --- A written Chinese automatic word segmentation system","volume":"1","author":"Liang N","year":"1987","unstructured":"Liang, N. CDWS --- A written Chinese automatic word segmentation system. Journal of Chinese Information Processing, 1, 2 (1987), 44--52.","journal-title":"Journal of Chinese Information Processing"},{"key":"e_1_2_1_22_1","volume-title":"The Seventh Message Understanding Conference Proceedings (MUC-7), Washington, D.C., USA","author":"Yu S.","year":"1998","unstructured":"Yu , S. , Bai , S. , and Wu , P . Description of the Kent Ridge Digital Labs system used for MUC-7 . 
in The Seventh Message Understanding Conference Proceedings (MUC-7), Washington, D.C., USA , 1998 , Yu, S., Bai, S., and Wu, P. Description of the Kent Ridge Digital Labs system used for MUC-7. in The Seventh Message Understanding Conference Proceedings (MUC-7), Washington, D.C., USA, 1998,"},{"key":"e_1_2_1_23_1","volume-title":"USA","author":"Chen H.-H.","year":"1998","unstructured":"Chen , H.-H. , Ding , Y.-W. , Tsai , S.-C. , and Bian , G . -W. Description of the NTU system used for MET2. in the Seventh Message Understanding Conference Proceedings (MUC-7), Washington, D.C ., USA , 1998 , Chen, H.-H., Ding, Y.-W., Tsai, S.-C., and Bian, G.-W. Description of the NTU system used for MET2. in the Seventh Message Understanding Conference Proceedings (MUC-7), Washington, D.C., USA, 1998,"}],"container-title":["ACM SIGKDD Explorations Newsletter"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1089815.1089819","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1089815.1089819","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T16:08:16Z","timestamp":1750262896000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1089815.1089819"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2005,6]]},"references-count":22,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2005,6]]}},"alternative-id":["10.1145\/1089815.1089819"],"URL":"https:\/\/doi.org\/10.1145\/1089815.1089819","relation":{},"ISSN":["1931-0145","1931-0153"],"issn-type":[{"value":"1931-0145","type":"print"},{"value":"1931-0153","type":"electronic"}],"subject":[],"published":{"date-parts":[[2005,6]]},"assertion":[{"value":"2005-06-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Pu
blication History"}}]}}