{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,30]],"date-time":"2025-08-30T16:34:02Z","timestamp":1756571642361,"version":"3.40.5"},"reference-count":39,"publisher":"Cambridge University Press (CUP)","issue":"5","license":[{"start":{"date-parts":[[2019,11,22]],"date-time":"2019-11-22T00:00:00Z","timestamp":1574380800000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/www.cambridge.org\/core\/terms"}],"content-domain":{"domain":["cambridge.org"],"crossmark-restriction":true},"short-container-title":["Nat. Lang. Eng."],"published-print":{"date-parts":[[2020,9]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Entities play an essential role in understanding textual documents, regardless of whether the documents are short, such as tweets, or long, such as news articles. In short textual documents, all entities mentioned are usually considered equally important because of the limited amount of information. In long textual documents, however, not all entities are equally important: some are salient and others are not. Traditional entity topic models (ETMs) focus on ways to incorporate entity information into topic models to better explain the generative process of documents. However, entities are usually treated equally, without considering whether they are salient or not. In this work, we propose a novel ETM, Salient Entity Topic Model, to take salient entities into consideration in the document generation process. In particular, we model salient entities as a source of topics used to generate words in documents, in addition to the topic distribution of documents used in traditional topic models. Qualitative and quantitative analysis is performed on the proposed model. Application to entity salience detection demonstrates the effectiveness of our model compared to the state-of-the-art topic model baselines.<\/jats:p>","DOI":"10.1017\/s1351324919000585","type":"journal-article","created":{"date-parts":[[2019,11,22]],"date-time":"2019-11-22T08:13:17Z","timestamp":1574410397000},"page":"531-549","update-policy":"https:\/\/doi.org\/10.1017\/policypage","source":"Crossref","is-referenced-by-count":4,"title":["It all starts with entities: A Salient entity topic model"],"prefix":"10.1017","volume":"26","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8784-0808","authenticated-orcid":false,"given":"Chuan","family":"Wu","sequence":"first","affiliation":[]},{"given":"Evangelos","family":"Kanoulas","sequence":"additional","affiliation":[]},{"given":"Maarten","family":"de Rijke","sequence":"additional","affiliation":[]}],"member":"56","published-online":{"date-parts":[[2019,11,22]]},"reference":[{"key":"S1351324919000585_ref2","doi-asserted-by":"publisher","DOI":"10.1145\/1553374.1553378"},{"key":"S1351324919000585_ref9","doi-asserted-by":"publisher","DOI":"10.1145\/2505515.2505602"},{"key":"S1351324919000585_ref32","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-011-5272-5"},{"key":"S1351324919000585_ref7","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.0307760101"},{"key":"S1351324919000585_ref30","doi-asserted-by":"publisher","DOI":"10.3115\/1699510.1699543"},{"key":"S1351324919000585_ref16","doi-asserted-by":"publisher","DOI":"10.1145\/2020408.2020574"},{"key":"S1351324919000585_ref4","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2017.02.007"},{"key":"S1351324919000585_ref14","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-30217-6_31"},{"key":"S1351324919000585_ref19","unstructured":"Lau, J.H. , Grieser, K. , Newman, D. and Baldwin, T. (2011). Automatic labelling of topic models. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics (ACL), pp. 1536\u20131545."},{"key":"S1351324919000585_ref28","unstructured":"Ponza, M. , Ferragina, P. and Piccinno, F. (2018). Swat: A System for Detecting Salient Wikipedia Entities in Texts. arXiv preprint arXiv:1804.03580."},{"key":"S1351324919000585_ref17","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2012.107"},{"key":"S1351324919000585_ref37","doi-asserted-by":"publisher","DOI":"10.1145\/3209978.3209982"},{"key":"S1351324919000585_ref15","doi-asserted-by":"publisher","DOI":"10.1145\/2396761.2398669"},{"key":"S1351324919000585_ref27","doi-asserted-by":"publisher","DOI":"10.1145\/1150402.1150487"},{"key":"S1351324919000585_ref34","doi-asserted-by":"publisher","DOI":"10.1145\/2806416.2806486"},{"key":"S1351324919000585_ref29","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2017.05.018"},{"key":"S1351324919000585_ref36","doi-asserted-by":"crossref","unstructured":"Xie, R. , Liu, Z. , Jia, J. , Luan, H. and Sun, M. (2016). Representation learning of knowledge graphs with entity descriptions. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence and the Twenty-Eighth Innovative Applications of Artificial Intelligence Conference. Association for the Advancement of Artificial Intelligence, pp. 2659\u20132665.","DOI":"10.1609\/aaai.v30i1.10329"},{"key":"S1351324919000585_ref38","doi-asserted-by":"publisher","DOI":"10.3233\/IDA-160021"},{"key":"S1351324919000585_ref25","unstructured":"McCallum, A. (1999). Multi-label text classification with a mixture model trained by em. In AAAI workshop on Text Learning. Association for the Advancement of Artificial Intelligence, pp. 1\u20137."},{"key":"S1351324919000585_ref1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-56608-5_40"},{"key":"S1351324919000585_ref3","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-93935-3"},{"key":"S1351324919000585_ref35","doi-asserted-by":"publisher","DOI":"10.1145\/2872427.2883086"},{"key":"S1351324919000585_ref5","first-page":"993","article-title":"Latent Dirichlet allocation","volume":"3","author":"Blei","year":"2003","journal-title":"Journal of Machine Learning Research"},{"key":"S1351324919000585_ref6","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/E14-4040"},{"key":"S1351324919000585_ref8","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/E17-1103"},{"key":"S1351324919000585_ref10","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.0307752101"},{"key":"S1351324919000585_ref11","unstructured":"Han, X. and Sun, L. (2012). An entity-topic model for entity linking. In Proceedings of the 2012 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics (ACL), pp. 105\u2013115."},{"key":"S1351324919000585_ref12","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-18038-0_54"},{"key":"S1351324919000585_ref13","doi-asserted-by":"publisher","DOI":"10.1145\/2433396.2433454"},{"key":"S1351324919000585_ref18","doi-asserted-by":"publisher","DOI":"10.1145\/1557019.1557073"},{"key":"S1351324919000585_ref20","doi-asserted-by":"crossref","first-page":"67","DOI":"10.4000\/ijcol.392","article-title":"Entities as topic labels: combining entity linking and labeled lda to improve topic interpretability and evaluability","volume":"20","author":"Lauscher","year":"2016","journal-title":"IJCol-Italian Journal of Computational Linguistics"},{"key":"S1351324919000585_ref21","doi-asserted-by":"crossref","unstructured":"Levit, M. , Parthasarathy, S. , Chang, S. , Stolcke, A. and Dumoulin, B. (2014). Word-phrase-entity language models: Getting more mileage out of n-grams. In 15th Annual Conference of the International Speech Communication Association. International Speech Communication Association, pp. 666\u2013670.","DOI":"10.21437\/Interspeech.2014-168"},{"key":"S1351324919000585_ref22","doi-asserted-by":"publisher","DOI":"10.1016\/j.patrec.2015.04.012"},{"key":"S1351324919000585_ref23","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2014.07.053"},{"key":"S1351324919000585_ref24","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2018.04.071"},{"key":"S1351324919000585_ref39","doi-asserted-by":"publisher","DOI":"10.1145\/2983323.2983898"},{"key":"S1351324919000585_ref26","unstructured":"McCallum, A. , Corrada-Emmanuel, A. and Wang, X. (2005). The author-recipient-topic model for topic and role discovery in social networks, with application to enron and academic email. In Proceedings of Workshop on Link Analysis, Counterterrorism and Security, p. 33."},{"key":"S1351324919000585_ref31","unstructured":"Rosen-Zvi, M. , Griffiths, T. , Steyvers, M. and Smyth, P. (2004). The author-topic model for authors and documents. In Proceedings of the Twentieth Conference on Uncertainty in Artificial Intelligence (2004). AUAI Press, pp. 487\u2013494."},{"key":"S1351324919000585_ref33","doi-asserted-by":"publisher","DOI":"10.1145\/2487575.2487686"}],"container-title":["Natural Language Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.cambridge.org\/core\/services\/aop-cambridge-core\/content\/view\/S1351324919000585","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,10,6]],"date-time":"2022-10-06T18:01:19Z","timestamp":1665079279000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.cambridge.org\/core\/product\/identifier\/S1351324919000585\/type\/journal_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,11,22]]},"references-count":39,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2020,9]]}},"alternative-id":["S1351324919000585"],"URL":"https:\/\/doi.org\/10.1017\/s1351324919000585","relation":{},"ISSN":["1351-3249","1469-8110"],"issn-type":[{"type":"print","value":"1351-3249"},{"type":"electronic","value":"1469-8110"}],"subject":[],"published":{"date-parts":[[2019,11,22]]},"assertion":[{"value":"\u00a9 Cambridge University Press 2019","name":"copyright","label":"Copyright","group":{"name":"copyright_and_licensing","label":"Copyright and Licensing"}}]}}