{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,2]],"date-time":"2026-04-02T17:11:53Z","timestamp":1775149913126,"version":"3.50.1"},"reference-count":19,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2018,7,21]],"date-time":"2018-07-21T00:00:00Z","timestamp":1532131200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001691","name":"KAKENHI","doi-asserted-by":"crossref","award":["15K16046"],"award-info":[{"award-number":["15K16046"]}],"id":[{"id":"10.13039\/501100001691","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Asian Low-Resour. Lang. Inf. Process."],"published-print":{"date-parts":[[2018,12,31]]},"abstract":"<jats:p>The authors compared two methods for annotating a corpus for the named entity (NE) recognition task using non-expert annotators: (i) revising the results of an existing NE recognizer and (ii) manually annotating the NEs completely. The annotation time, degree of agreement, and performance were evaluated based on the gold standard. Because there were two annotators for one text for each method, two performances were evaluated: the average performance of both annotators and the performance when at least one annotator is correct. The experiments reveal that semi-automatic annotation is faster, achieves better agreement, and performs better on average. However, they also indicate that sometimes, fully manual annotation should be used for some texts whose document types are substantially different from the training data document types. 
In addition, the machine learning experiments using semi-automatic and fully manually annotated corpora as training data indicate that the F-measures could be better for some texts when manual instead of semi-automatic annotation was used. Finally, experiments using the annotated corpora for training as additional corpora show that (i) the NE recognition performance does not always correspond to the performance of the NE tag annotation and (ii) the system trained with the manually annotated corpus outperforms the system trained with the semi-automatically annotated corpus with respect to newswires, even though the existing NE recognizer was mainly trained with newswires.<\/jats:p>","DOI":"10.1145\/3218820","type":"journal-article","created":{"date-parts":[[2018,7,23]],"date-time":"2018-07-23T13:02:15Z","timestamp":1532350935000},"page":"1-16","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":8,"title":["Comparison of Methods to Annotate Named Entity Corpora"],"prefix":"10.1145","volume":"17","author":[{"given":"Kanako","family":"Komiya","sequence":"first","affiliation":[{"name":"Ibaraki University, Ibaraki, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Masaya","family":"Suzuki","sequence":"additional","affiliation":[{"name":"Ibaraki University, Ibaraki, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tomoya","family":"Iwakura","sequence":"additional","affiliation":[{"name":"Fujitsu Laboratories, Kawasaki, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Minoru","family":"Sasaki","sequence":"additional","affiliation":[{"name":"Ibaraki University, Ibaraki, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hiroyuki","family":"Shinnou","sequence":"additional","affiliation":[{"name":"Ibaraki University, Ibaraki, 
Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2018,7,21]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Proceedings of 4th Linguistic Annotation Workshop, ACL","author":"Alex Bea","year":"2010"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.5555\/1699765.1699774"},{"key":"e_1_2_1_3_1","volume-title":"Proceedings of 4th Linguistic Annotation Workshop, ACL","author":"der Plas Lonneke Van","year":"2010"},{"key":"e_1_2_1_4_1","volume-title":"Proceedings of COLING","author":"Guillaume Bruno","year":"2016"},{"key":"e_1_2_1_5_1","volume-title":"Proceedings of PACLIC","author":"Hangyo Masatsugu","year":"2012"},{"key":"e_1_2_1_7_1","volume-title":"NLP2015 Error Analysis Workshop","author":"Hirata Ai","year":"2015"},{"key":"e_1_2_1_8_1","volume-title":"NLP2015 Error Analysis Workshop","author":"Ichihara Masaaki","year":"2015"},{"key":"e_1_2_1_9_1","volume-title":"NLP2015 Error Analysis Workshop","author":"Iwakura Tomoya","year":"2015"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W16-2706"},{"key":"e_1_2_1_11_1","volume-title":"Proceedings of COLING","author":"Kawahara Daisuke","year":"2014"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W16-1708"},{"key":"e_1_2_1_13_1","volume-title":"Retrieved","author":"Burnard Lou","year":"2010"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W16-1316"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10579-013-9261-0"},{"key":"e_1_2_1_16_1","unstructured":"Mitchell P. Marcus Mary Ann Marcinkiewicz and Beatrice Santorini. 1993. Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics\u2014Special Issue on Using Large Corpora: II 19 (1993) 313--330."},
{"key":"e_1_2_1_17_1","volume-title":"International Conference of the Pacific Association for Computational Linguistics. 10--17","author":"Sasada Tetsuro","year":"2015"},{"key":"e_1_2_1_18_1","volume-title":"Proceedings of International Joint Conference on Natural Language Processing. 607--612","author":"Sasano Ryohei","year":"2008"},{"key":"e_1_2_1_19_1","volume-title":"Proceedings of the Language Resources and Evaluation Conference, No. 1019","author":"Sekine Satoshi","year":"2000"},{"key":"e_1_2_1_20_1","volume-title":"Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing (EMNLP\u201908)","author":"Snow Rion"}],"container-title":["ACM Transactions on Asian and Low-Resource Language Information Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3218820","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3218820","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T02:07:05Z","timestamp":1750212425000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3218820"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,7,21]]},"references-count":19,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2018,12,31]]}},"alternative-id":["10.1145\/3218820"],"URL":"https:\/\/doi.org\/10.1145\/3218820","relation":{},"ISSN":["2375-4699","2375-4702"],"issn-type":[{"value":"2375-4699","type":"print"},{"value":"2375-4702","type":"electronic"}],"subject":[],"published":{"date-parts":[[2018,7,21]]},"assertion":[{"value":"2017-10-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2018-05-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2018-07-21","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}