{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,25]],"date-time":"2026-04-25T04:04:13Z","timestamp":1777089853292,"version":"3.51.4"},"reference-count":45,"publisher":"MIT Press - Journals","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Transactions of the Association for Computational Linguistics"],"published-print":{"date-parts":[[2020,12]]},"abstract":"<jats:p> Cross-lingual entity linking (XEL) is the task of finding referents in a target-language knowledge base (KB) for mentions extracted from source-language texts. The first step of (X)EL is candidate generation, which retrieves a list of plausible candidate entities from the target-language KB for each mention. Approaches based on resources from Wikipedia have proven successful in the realm of relatively high-resource languages, but these do not extend well to low-resource languages with few, if any, Wikipedia pages. Recently, transfer learning methods have been shown to reduce the demand for resources in the low-resource languages by utilizing resources in closely related languages, but the performance still lags far behind their high-resource counterparts. In this paper, we first assess the problems faced by current entity candidate generation methods for low-resource XEL, then propose three improvements that (1) reduce the disconnect between entity mentions and KB entries, and (2) improve the robustness of the model to low-resource scenarios. The methods are simple, but effective: We experiment with our approach on seven XEL datasets and find that they yield an average gain of 16.9% in Top-30 gold candidate recall, compared with state-of-the-art baselines. Our improved model also yields an average gain of 7.9% in in-KB accuracy of end-to-end XEL. <jats:sup>1<\/jats:sup> <\/jats:p>","DOI":"10.1162\/tacl_a_00303","type":"journal-article","created":{"date-parts":[[2020,3,31]],"date-time":"2020-03-31T14:54:52Z","timestamp":1585666492000},"page":"109-124","source":"Crossref","is-referenced-by-count":12,"title":["Improving Candidate Generation for Low-resource Cross-lingual Entity Linking"],"prefix":"10.1162","volume":"8","author":[{"given":"Shuyan","family":"Zhou","sequence":"first","affiliation":[{"name":"Language Technologies Institute, Carnegie Mellon University."}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shruti","family":"Rijhwani","sequence":"additional","affiliation":[{"name":"Language Technologies Institute, Carnegie Mellon University."}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"John","family":"Wieting","sequence":"additional","affiliation":[{"name":"Language Technologies Institute, Carnegie Mellon University."}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jaime","family":"Carbonell","sequence":"additional","affiliation":[{"name":"Language Technologies Institute, Carnegie Mellon University."}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Graham","family":"Neubig","sequence":"additional","affiliation":[{"name":"Language Technologies Institute, Carnegie Mellon University."}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"281","reference":[{"key":"bib1","first-page":"789","volume-title":"Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics","author":"Artetxe Mikel","year":"2018"},{"key":"bib2","doi-asserted-by":"crossref","first-page":"20","DOI":"10.18653\/v1\/W19-2804","volume-title":"Proceedings of the Second Workshop on Computational Models of Reference, Anaphora and Coreference","author":"Blissett Kevin","year":"2019"},{"key":"bib3","volume-title":"11th Conference of the European Chapter of the Association for Computational Linguistics","author":"Bunescu Razvan","year":"2006"},{"key":"bib4","doi-asserted-by":"crossref","first-page":"261","DOI":"10.18653\/v1\/D18-1024","volume-title":"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing","author":"Chen Xilun","year":"2018"},{"key":"bib5","author":"Conneau Alexis","year":"2017","journal-title":"International Conference on Learning Representations"},{"key":"bib6","first-page":"748","volume-title":"Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing","author":"Cotterell Ryan","year":"2017"},{"key":"bib7","volume-title":"Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning","author":"Cucerzan Silviu","year":"2007"},{"key":"bib8","first-page":"3079","volume-title":"Advances in Neural Information Processing Systems","author":"Dai Andrew M.","year":"2015"},{"key":"bib9","first-page":"277","volume-title":"Proceedings of the 23rd International Conference on Computational Linguistics","author":"Dredze Mark","year":"2010"},{"key":"bib10","first-page":"2619","volume-title":"Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing","author":"Ganea Octavian-Eugen","year":"2017"},{"key":"bib11","first-page":"621","volume-title":"Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics","author":"Globerson Amir","year":"2016"},{"key":"bib12","first-page":"771","volume-title":"Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics","author":"Haghighi Aria","year":"2008"},{"key":"bib13","first-page":"782","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing","author":"Hoffart Johannes","year":"2011"},{"key":"bib14","volume-title":"Text Analysis Conference","author":"Ji Heng","year":"2015"},{"key":"bib15","author":"Johnson Jeff","year":"2019","journal-title":"IEEE Transactions on Big Data"},{"key":"bib16","first-page":"599","volume":"24","author":"Knight Kevin","year":"1998","journal-title":"Computational Linguistics"},{"key":"bib17","first-page":"159","volume-title":"Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics","author":"Li Haizhou","year":"2004"},{"key":"bib18","volume-title":"The 57th Annual Meeting of the Association for Computational Linguistics","author":"Lin Yu-Hsiang","year":"2019"},{"key":"bib19","first-page":"255","volume-title":"Proceedings of 5th International Joint Conference on Natural Language Processing","author":"McNamee Paul","year":"2011"},{"key":"bib20","first-page":"6443","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing","author":"Min Bonan","year":"2019"},{"key":"bib21","volume-title":"Proceedings of the Eleventh International Conference on Language Resources and Evaluation","author":"Mortensen David R.","year":"2018"},{"key":"bib22","first-page":"1946","volume-title":"Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics","author":"Pan Xiaoman","year":"2017"},{"key":"bib23","first-page":"1844","volume-title":"Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)","author":"Radhakrishnan Priya","year":"2018"},{"key":"bib24","doi-asserted-by":"crossref","first-page":"151","DOI":"10.18653\/v1\/P19-1015","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Rahimi Afshin","year":"2019"},{"key":"bib25","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9781139058452","volume-title":"Mining of Massive Datasets","author":"Rajaraman Anand","year":"2011"},{"key":"bib26","volume-title":"Thirty-Third AAAI Conference on Artificial Intelligence (AAAI)","author":"Rijhwani Shruti","year":"2019"},{"key":"bib27","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1109\/TKDE.2014.2327028","author":"Shen Wei","year":"2015","journal-title":"IEEE Transactions on Knowledge and Data Engineering"},{"key":"bib28","first-page":"2255","volume-title":"Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics","author":"Sil Avirup","year":"2016"},{"key":"bib29","volume-title":"Thirty-Second AAAI Conference on Artificial Intelligence","author":"Sil Avirup","year":"2018"},{"key":"bib30","first-page":"3168","volume-title":"Proceedings of the Eighth International Conference on Language Resources and Evaluation","author":"Spitkovsky Valentin I.","year":"2012"},{"key":"bib31","first-page":"3273","volume-title":"Proceedings of the Tenth International Conference on Language Resources and Evaluation","author":"Strassel Stephanie","year":"2016"},{"key":"bib32","first-page":"32","volume-title":"SMERP@ ECIR","author":"Strassel Stephanie M.","year":"2017"},{"key":"bib33","first-page":"477","volume-title":"Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"T\u00e4ckstr\u00f6m Oscar","year":"2012"},{"key":"bib34","first-page":"589","volume-title":"Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Tsai Chen-Tse","year":"2016"},{"key":"bib35","volume-title":"Thirty-Second AAAI Conference on Artificial Intelligence","author":"Tsai Chen-Tse","year":"2018"},{"key":"bib36","doi-asserted-by":"crossref","first-page":"2486","DOI":"10.18653\/v1\/D18-1270","volume-title":"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing","author":"Upadhyay Shyam","year":"2018"},{"key":"bib37","doi-asserted-by":"crossref","first-page":"501","DOI":"10.18653\/v1\/D18-1046","volume-title":"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing","author":"Upadhyay Shyam","year":"2018"},{"key":"bib38","first-page":"5998","volume-title":"Advances in Neural Information Processing Systems","author":"Vaswani Ashish","year":"2017"},{"key":"bib39","doi-asserted-by":"crossref","first-page":"15","DOI":"10.18653\/v1\/W16-1403","volume-title":"Proceedings of TextGraphs-10: the Workshop on Graph-based Methods for Natural Language Processing","author":"Veyseh Amir Pouran Ben","year":"2016"},{"key":"bib40","volume-title":"Proceedings of International Conference on Learning Representations","author":"Wieting John","year":"2016"},{"key":"bib41","doi-asserted-by":"crossref","first-page":"1504","DOI":"10.18653\/v1\/D16-1157","volume-title":"Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing","author":"Wieting John","year":"2016"},{"key":"bib42","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00069"},{"key":"bib43","first-page":"649","volume-title":"Advances in Neural Information Processing Systems","author":"Zhang Xiang","year":"2015"},{"key":"bib44","first-page":"1307","volume-title":"Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Zhang Yuan","year":"2016"},{"key":"bib45","volume-title":"Workshop on Deep Learning for Low-resource NLP","author":"Zhou Shuyan","year":"2019"}],"container-title":["Transactions of the Association for Computational Linguistics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mitpressjournals.org\/doi\/pdf\/10.1162\/tacl_a_00303","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,3,12]],"date-time":"2021-03-12T21:39:33Z","timestamp":1615585173000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/tacl\/article\/43544"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,12]]},"references-count":45,"alternative-id":["10.1162\/tacl_a_00303"],"URL":"https:\/\/doi.org\/10.1162\/tacl_a_00303","relation":{},"ISSN":["2307-387X"],"issn-type":[{"value":"2307-387X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,12]]}}}