{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,10]],"date-time":"2025-12-10T16:01:16Z","timestamp":1765382476954,"version":"3.30.2"},"reference-count":40,"publisher":"Cambridge University Press (CUP)","issue":"6","license":[{"start":{"date-parts":[[2024,1,25]],"date-time":"2024-01-25T00:00:00Z","timestamp":1706140800000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["cambridge.org"],"crossmark-restriction":true},"short-container-title":["Nat. Lang. Eng."],"published-print":{"date-parts":[[2024,11]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Coreference resolution is the task of identifying and clustering mentions that refer to the same entity in a document. Based on state-of-the-art deep learning approaches, end-to-end coreference resolution considers all spans as candidate mentions and tackles mention detection and coreference resolution simultaneously. Recently, researchers have attempted to incorporate document-level context using higher-order inference (HOI) to improve end-to-end coreference resolution. However, HOI methods have been shown to have marginal or even negative impact on coreference resolution. In this paper, we reveal the reasons for the negative impact of HOI coreference resolution. Contextualized representations (e.g., those produced by BERT) for building span embeddings have been shown to be highly anisotropic. We show that HOI actually increases and thus worsens the anisotropy of span embeddings and makes it difficult to distinguish between related but distinct entities (e.g., <jats:italic>pilots<\/jats:italic> and <jats:italic>flight attendants<\/jats:italic>). Instead of using HOI, we propose two methods, Less-Anisotropic Internal Representations (LAIR) and Data Augmentation with Document Synthesis and Mention Swap (DSMS), to learn less-anisotropic span embeddings for coreference resolution. LAIR uses a linear aggregation of the first layer and the topmost layer of contextualized embeddings. DSMS generates more diversified examples of related but distinct entities by synthesizing documents and by mention swapping. Our experiments show that less-anisotropic span embeddings improve the performance significantly (+2.8 F1 gain on the OntoNotes benchmark) reaching new state-of-the-art performance on the GAP dataset.<\/jats:p>","DOI":"10.1017\/s1351324924000019","type":"journal-article","created":{"date-parts":[[2024,1,25]],"date-time":"2024-01-25T07:40:11Z","timestamp":1706168411000},"page":"1301-1322","update-policy":"https:\/\/doi.org\/10.1017\/policypage","source":"Crossref","is-referenced-by-count":3,"title":["Anisotropic span embeddings and the negative impact of higher-order inference for coreference resolution: An empirical analysis"],"prefix":"10.1017","volume":"30","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5776-9177","authenticated-orcid":false,"given":"Feng","family":"Hou","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ruili","family":"Wang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"See-Kiong","family":"Ng","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Fangyi","family":"Zhu","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Michael","family":"Witbrock","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Steven F.","family":"Cahan","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lily","family":"Chen","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiaoyun","family":"Jia","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"56","published-online":{"date-parts":[[2024,1,25]]},"reference":[{"key":"S1351324924000019_ref11","unstructured":"Devlin, J. , Chang, M.-W. , Lee, K. and Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Long and Short Papers), Minneapolis, Minnesota: Association for Computational Linguistics, vol. 1, pp. 4171\u20134186 ."},{"key":"S1351324924000019_ref19","doi-asserted-by":"crossref","unstructured":"Lee, K. , He, L. and Zettlemoyer, L. (2018). Higher-order coreference resolution with coarse-to-fine inference, (Short Papers), Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Short Papers), New Orleans, Louisiana: Association for Computational Linguistics, vol. 2, pp. 687\u2013692 .","DOI":"10.18653\/v1\/N18-2108"},{"key":"S1351324924000019_ref8","unstructured":"Clark, K. , Luong, M.-T. , Le, Q. V. and Manning, C. D. (2020). ELECTRA: Pre-training text encoders as discriminators rather than generators. In Proceedings of the 8th International Conference on Learning Representations (ICLR)."},{"key":"S1351324924000019_ref29","doi-asserted-by":"crossref","first-page":"485","DOI":"10.1017\/S135132491000029X","article-title":"Blanc: implementing the rand index for coreference evaluation","volume":"17","author":"Recasens","year":"2011","journal-title":"Natural Language Engineering"},{"key":"S1351324924000019_ref27","unstructured":"Pradhan, S. , Moschitti, A. , Xue, N. , Uryupina, O. and Zhang, Y. (2012). Conll-2012 shared task: Modeling multilingual unrestricted coreference in ontonotes. In Joint Conference on EMNLP and CoNLL-Shared Task, Association for Computational Linguistics, pp. 1\u201340."},{"key":"S1351324924000019_ref20","doi-asserted-by":"crossref","first-page":"733","DOI":"10.1017\/S1351324917000109","article-title":"A scaffolding approach to coreference resolution integrating statistical and rule-based models","volume":"23","author":"Lee","year":"2017","journal-title":"Natural Language Engineering"},{"key":"S1351324924000019_ref15","doi-asserted-by":"crossref","first-page":"64","DOI":"10.1162\/tacl_a_00300","article-title":"Spanbert: improving pre-training by representing and predicting spans","volume":"8","author":"Joshi","year":"2020","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"S1351324924000019_ref2","doi-asserted-by":"crossref","unstructured":"Agirre, E. , Banea, C. , Cardie, C. , Cer, D. , Diab, M. , Gonzalez-Agirre, A. , Guo, W. , Mihalcea, R. , Rigau, G. and Wiebe, J. (2014). SemEval-2014 task 10: Multilingual semantic textual similarity. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), Dublin, Ireland: Association for Computational Linguistics, pp. 81\u201391.","DOI":"10.3115\/v1\/S14-2010"},{"key":"S1351324924000019_ref14","doi-asserted-by":"crossref","unstructured":"Hou, F. , Wang, R. , He, J. and Zhou, Y. (2020). Improving entity linking through semantic reinforced entity embeddings. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online. Association for Computational Linguistics, pp. 6843\u20136848.","DOI":"10.18653\/v1\/2020.acl-main.612"},{"key":"S1351324924000019_ref4","unstructured":"Agirre, E. , Cer, D. , Diab, M. , Gonzalez-Agirre, A. and Guo, W. (2013). *SEM. 2013 shared task: Semantic textual similarity. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 1: Proceedings of the Main Conference and the Shared Task: Semantic Textual Similarity, Atlanta, Georgia, USA: Association for Computational Linguistics, pp. 32\u201343."},{"key":"S1351324924000019_ref23","unstructured":"Mu, J. , Bhat, S. and Viswanath, P. (2018). All-but-the-top: Simple and effective postprocessing for word representations. In Proceedings of the 6th International Conference on Learning Representations (ICLR)."},{"key":"S1351324924000019_ref16","doi-asserted-by":"crossref","unstructured":"Joshi, M. , Levy, O. , Zettlemoyer, L. and Weld, D. (2019). BERT for coreference resolution: Baselines and analysis. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China: Association for Computational Linguistics, pp. 5803\u20135808.","DOI":"10.18653\/v1\/D19-1588"},{"key":"S1351324924000019_ref30","unstructured":"Santos, C. D. and Zadrozny, B. (2014). Learning character-level representations for part-of-speech tagging. In Proceedings of the 31st International Conference on Machine Learning (ICML-14), pp. 1818\u20131826."},{"key":"S1351324924000019_ref33","doi-asserted-by":"crossref","unstructured":"Vilain, M. , Burger, J. , Aberdeen, J. , Connolly, D. and Hirschman, L. (1995). A model-theoretic coreference scoring scheme. In Proceedings of the 6th Conference on Message Understanding, Association for Computational Linguistics, pp. 45\u201352.","DOI":"10.3115\/1072399.1072405"},{"key":"S1351324924000019_ref32","first-page":"5998","volume-title":"Advances in Neural Information Processing Systems","author":"Vaswani","year":"2017"},{"key":"S1351324924000019_ref6","unstructured":"Arora, S. , Liang, Y. and Ma, T. (2017). A simple but tough-to-beat baseline for sentence embeddings. In Proceedings of International Conference on Learning Representations."},{"key":"S1351324924000019_ref28","unstructured":"Raghunathan, K. , Lee, H. , Rangarajan, S. , Chambers, N. , Surdeanu, M. , Jurafsky, D. and Manning, C. (2010). A multi-pass sieve for coreference resolution. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, Cambridge, MA: Association for Computational Linguistics, pp. 492\u2013501."},{"key":"S1351324924000019_ref34","doi-asserted-by":"crossref","first-page":"605","DOI":"10.1162\/tacl_a_00240","article-title":"Mind the GAP: a balanced corpus of gendered ambiguous pronouns","volume":"6","author":"Webster","year":"2018","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"S1351324924000019_ref26","doi-asserted-by":"crossref","first-page":"281","DOI":"10.1017\/S135132490300319X","article-title":"Evaluation-driven design of a robust coreference resolution system","volume":"9","author":"Popescu-Belis","year":"2003","journal-title":"Natural Language Engineering"},{"key":"S1351324924000019_ref35","doi-asserted-by":"crossref","unstructured":"Wiseman, S. , Rush, A. M. and Shieber, S. M. (2016). Learning global features for coreference resolution. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, California: Association for Computational Linguistics, pp. 994\u20131004.","DOI":"10.18653\/v1\/N16-1114"},{"key":"S1351324924000019_ref24","first-page":"1532","volume-title":"Empirical Methods in Natural Language Processing (EMNLP)","author":"Pennington","year":"2014"},{"key":"S1351324924000019_ref21","doi-asserted-by":"crossref","unstructured":"Luo, X. (2005). On coreference resolution performance metrics. In Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing. Association for Computational Linguistics, pp. 25\u201332.","DOI":"10.3115\/1220575.1220579"},{"key":"S1351324924000019_ref9","doi-asserted-by":"crossref","unstructured":"Clark, K. and Manning, C. D. (2016a). Deep reinforcement learning for mention-ranking coreference models. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, Texas: Association for Computational Linguistics, pp. 2256\u20132262.","DOI":"10.18653\/v1\/D16-1245"},{"key":"S1351324924000019_ref12","unstructured":"Durrett, G. and Klein, D. (2013). Easy victories and uphill battles in coreference resolution. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, Seattle, Washington, USA: Association for Computational Linguistics, pp. 1971\u20131982."},{"key":"S1351324924000019_ref37","doi-asserted-by":"crossref","unstructured":"Wu, W. , Wang, F. , Yuan, A. , Wu, F. and Li, J. (2020). CorefQA: Coreference resolution as query-based span prediction. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online. Association for Computational Linguistics, pp. 6953\u20136963.","DOI":"10.18653\/v1\/2020.acl-main.622"},{"key":"S1351324924000019_ref18","doi-asserted-by":"crossref","unstructured":"Lee, K. , He, L. , Lewis, M. and Zettlemoyer, L. (2017b). End-to-end neural coreference resolution. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark: Association for Computational Linguistics, pp. 188\u2013197.","DOI":"10.18653\/v1\/D17-1018"},{"key":"S1351324924000019_ref39","doi-asserted-by":"crossref","unstructured":"Yaghoobzadeh, Y. , Kann, K. , Hazen, T. J. , Agirre, E. and Sch\u00fctze, H. (2019). Probing for semantic classes: Diagnosing the meaning content of word embeddings. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy: Association for Computational Linguistics, pp. 5740\u20135753.","DOI":"10.18653\/v1\/P19-1574"},{"key":"S1351324924000019_ref10","doi-asserted-by":"crossref","unstructured":"Clark, K. and Manning, C. D. (2016b). Improving coreference resolution by learning entity-level distributed representations. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Berlin, Germany: Association for Computational Linguistics, pp. 643\u2013653.","DOI":"10.18653\/v1\/P16-1061"},{"key":"S1351324924000019_ref7","unstructured":"Bagga, A. and Baldwin, B. (1998). Algorithms for scoring coreference chains. In The First International Conference on Language Resources and Evaluation Workshop on Linguistics Coreference, Granada, vol. 1, pp. 563\u2013566."},{"key":"S1351324924000019_ref5","doi-asserted-by":"crossref","first-page":"483","DOI":"10.1162\/tacl_a_00034","article-title":"Linear algebraic structure of word senses, with applications to polysemy","volume":"6","author":"Arora","year":"2018","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"S1351324924000019_ref31","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1016\/j.inffus.2020.01.010","article-title":"Anaphora and coreference resolution: a review","volume":"59","author":"Sukthanker","year":"2020","journal-title":"Information Fusion"},{"key":"S1351324924000019_ref22","unstructured":"Mu, J. , Bhat, S. and Viswanath, P. (2017). Geometry of polysemy. In Proceedings of the 5th International Conference on Learning Representations (ICLR)."},{"key":"S1351324924000019_ref3","unstructured":"Agirre, E. , Cer, D. , Diab, M. and Gonzalez-Agirre, A. (2012). SemEval-2012 task 6: A pilot on semantic textual similarity, *SEM 2012: The First Joint Conference on Lexical and Computational Semantics \u2013 Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012), Montr\u00e9al, Canada: Association for Computational Linguistics, pp. 385\u2013393 ."},{"key":"S1351324924000019_ref38","doi-asserted-by":"crossref","unstructured":"Xu, L. and Choi, J. D. (2020). Revealing the myth of higher-order inference in coreference resolution. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online. Association for Computational Linguistics, pp. 8527\u20138533.","DOI":"10.18653\/v1\/2020.emnlp-main.686"},{"key":"S1351324924000019_ref1","doi-asserted-by":"crossref","unstructured":"Agirre, E. , Banea, C. , Cardie, C. , Cer, D. , Diab, M. , Gonzalez-Agirre, A. , Guo, W. , Lopez-Gazpio, I. , Maritxalar, M. , Mihalcea, R. , Rigau, G. , Uria, L. and Wiebe, J. (2015). SemEval-2015 task 2: Semantic textual similarity, English, Spanish and pilot on interpretability. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015), Denver, Colorado: Association for Computational Linguistics, pp. 252\u2013263.","DOI":"10.18653\/v1\/S15-2045"},{"key":"S1351324924000019_ref36","unstructured":"Wu, Y. , Schuster, M. , Chen, Z. , Le, Q. V. , Norouzi, M. , Macherey, W. , Krikun, M. , Cao, Y. , Gao, Q. , Macherey, K. and et\u00a0al. (2016). Google\u2019s neural machine translation system: bridging the gap between human and machine translation. cs.CL, 1\u201323, arXiv preprint arXiv: 1609.08144."},{"key":"S1351324924000019_ref25","doi-asserted-by":"crossref","unstructured":"Peters, M. , Neumann, M. , Iyyer, M. , Gardner, M. , Clark, C. , Lee, K. and Zettlemoyer, L. (2018). Deep contextualized word representations, (Long Papers), Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Long Papers), New Orleans, Louisiana: Association for Computational Linguistics, vol 1, pp. 2227\u20132237 .","DOI":"10.18653\/v1\/N18-1202"},{"key":"S1351324924000019_ref17","doi-asserted-by":"crossref","unstructured":"Kantor, B. and Globerson, A. (2019). Coreference resolution with entity equalization. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy: Association for Computational Linguistics, pp. 673\u2013677.","DOI":"10.18653\/v1\/P19-1066"},{"key":"S1351324924000019_ref13","doi-asserted-by":"crossref","unstructured":"Ethayarajh, K. (2019). How contextual are contextualized word representations? comparing the geometry of BERT, ELMo, and GPT-2 embeddings. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China: Association for Computational Linguistics, pp. 55\u201365.","DOI":"10.18653\/v1\/D19-1006"},{"key":"S1351324924000019_ref40","doi-asserted-by":"crossref","unstructured":"Zhang, R. , Nogueira dos Santos, C. , Yasunaga, M. , Xiang, B. and Radev, D. (2018). Neural coreference resolution with deep biaffine attention by joint mention detection and mention clustering. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Melbourne, Australia: Association for Computational Linguistics, pp. 102\u2013107.","DOI":"10.18653\/v1\/P18-2017"}],"container-title":["Natural Language Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.cambridge.org\/core\/services\/aop-cambridge-core\/content\/view\/S1351324924000019","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,12,12]],"date-time":"2024-12-12T11:12:32Z","timestamp":1734001952000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.cambridge.org\/core\/product\/identifier\/S1351324924000019\/type\/journal_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,1,25]]},"references-count":40,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2024,11]]}},"alternative-id":["S1351324924000019"],"URL":"https:\/\/doi.org\/10.1017\/s1351324924000019","relation":{},"ISSN":["1351-3249","1469-8110"],"issn-type":[{"type":"print","value":"1351-3249"},{"type":"electronic","value":"1469-8110"}],"subject":[],"published":{"date-parts":[[2024,1,25]]},"assertion":[{"value":"\u00a9 The Author(s), 2024. Published by Cambridge University Press","name":"copyright","label":"Copyright","group":{"name":"copyright_and_licensing","label":"Copyright and Licensing"}},{"value":"This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http:\/\/creativecommons.org\/licenses\/by\/4.0\/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.","name":"license","label":"License","group":{"name":"copyright_and_licensing","label":"Copyright and Licensing"}},{"value":"This content has been made available to all.","name":"free","label":"Free to read"}]}}