{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T22:16:27Z","timestamp":1740176187851,"version":"3.37.3"},"reference-count":43,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2019,11,15]],"date-time":"2019-11-15T00:00:00Z","timestamp":1573776000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2019,11,15]],"date-time":"2019-11-15T00:00:00Z","timestamp":1573776000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001691","name":"Japan Society for the Promotion of Science","doi-asserted-by":"publisher","award":["17H01828","18K19841"],"award-info":[{"award-number":["17H01828","18K19841"]}],"id":[{"id":"10.13039\/501100001691","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001691","name":"Japan Society for the Promotion of Science","doi-asserted-by":"publisher","award":["18H03243"],"award-info":[{"award-number":["18H03243"]}],"id":[{"id":"10.13039\/501100001691","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Data Sci. Eng."],"published-print":{"date-parts":[[2019,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Knowledge of entity histories is often necessary for comprehensive understanding and characterization of entities. Yet, the analysis of an entity\u2019s history is often most meaningful when carried out in comparison with the histories of other entities. In this paper, we describe a novel task of <jats:italic>history-based entity categorization<\/jats:italic> and <jats:italic>comparison<\/jats:italic>. 
Based on a set of entity-related documents which are assumed as an input, we determine latent entity categories whose members share similar histories; hence, we are effectively grouping entities based on the correspondences in their historical developments. Next, we generate comparative timelines for each determined group allowing users to elucidate similarities and differences in the histories of entities. We evaluate our approach on several datasets of different entity types demonstrating its effectiveness against competitive baselines.<\/jats:p>","DOI":"10.1007\/s41019-019-00108-x","type":"journal-article","created":{"date-parts":[[2019,11,15]],"date-time":"2019-11-15T22:07:38Z","timestamp":1573855658000},"page":"336-351","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Discovering Latent Threads in Entity Histories"],"prefix":"10.1007","volume":"4","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5098-8593","authenticated-orcid":false,"given":"Yijun","family":"Duan","sequence":"first","affiliation":[]},{"given":"Adam","family":"Jatowt","sequence":"additional","affiliation":[]},{"given":"Katsumi","family":"Tanaka","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2019,11,15]]},"reference":[{"key":"108_CR1","unstructured":"Arora S, Liang Y, Ma T (2016) A simple but tough-to-beat baseline for sentence embeddings"},{"key":"108_CR2","unstructured":"Bairi RB, Carman M, Ramakrishnan G (2015) On the evolution of Wikipedia: dynamics of categories and articles. In: AAAI"},{"key":"108_CR3","doi-asserted-by":"crossref","first-page":"363","DOI":"10.1162\/tacl_a_00189","volume":"2","author":"D Bamman","year":"2014","unstructured":"Bamman D, Smith NA (2014) Unsupervised discovery of biographical structure from text. 
TACL 2:363\u2013376","journal-title":"TACL"},{"key":"108_CR4","first-page":"33","volume-title":"Lecture Notes in Computer Science","author":"Roi Blanco","year":"2013","unstructured":"Blanco R, Cambazoglu BB, Mika P, Torzec N (2013) Entity recommendations in web search. In: ISWC. Springer, pp 33\u201348"},{"issue":"18","key":"108_CR5","doi-asserted-by":"publisher","first-page":"3825","DOI":"10.1016\/j.comnet.2012.10.007","volume":"56","author":"S Brin","year":"2012","unstructured":"Brin S, Page L (2012) Reprint of: the anatomy of a large-scale hypertextual web search engine. Comput Netw 56(18):3825\u20133833","journal-title":"Comput Netw"},{"key":"108_CR6","unstructured":"Brooks LR (1978) Nonanalytic concept formation and memory for instances. In: Rosch E, Lloyd B (eds) Cognition and categorization. Lawrence Erlbaum Associates, pp 3\u2013170"},{"key":"108_CR7","doi-asserted-by":"crossref","unstructured":"Carbonell J, Goldstein J (1998) The use of MMR, diversity-based reranking for reordering documents and producing summaries. In: SIGIR. ACM, pp 335\u2013336","DOI":"10.1145\/290941.291025"},{"key":"108_CR8","doi-asserted-by":"crossref","unstructured":"Duan Y, Jatowt A, Tanaka K (2017) Discovering typical histories of entities by multi-timeline summarization. In: Proceedings of the 28th ACM conference on hypertext and social media. ACM, pp 105\u2013114","DOI":"10.1145\/3078714.3078725"},{"key":"108_CR9","doi-asserted-by":"publisher","first-page":"457","DOI":"10.1613\/jair.1523","volume":"22","author":"G Erkan","year":"2004","unstructured":"Erkan G, Radev DR (2004) Lexrank: graph-based lexical centrality as salience in text summarization. J Artif Intell Res 22:457\u2013479","journal-title":"J Artif Intell Res"},{"issue":"5814","key":"108_CR10","doi-asserted-by":"publisher","first-page":"972","DOI":"10.1126\/science.1136800","volume":"315","author":"BJ Frey","year":"2007","unstructured":"Frey BJ, Dueck D (2007) Clustering by passing messages between data points. 
Science 315(5814):972\u2013976","journal-title":"Science"},{"key":"108_CR11","unstructured":"Gillenwater J, Kulesza A, Taskar B (2012) Discovering diverse and salient threads in document collections. In: EMNLP. Association for Computational Linguistics, pp 710\u2013720"},{"key":"108_CR12","doi-asserted-by":"crossref","unstructured":"Gunaratna K, Thirunarayan K, Sheth AP (2015) Faces: diversity-aware entity summarization using incremental hierarchical conceptual clustering. In: AAAI, pp 116\u2013122","DOI":"10.1609\/aaai.v29i1.9180"},{"key":"108_CR13","doi-asserted-by":"crossref","unstructured":"Haghighi A, Vanderwende L (2009) Exploring content models for multi-document summarization. In: NAACL. Association for Computational Linguistics, pp 362\u2013370","DOI":"10.3115\/1620754.1620807"},{"key":"108_CR14","unstructured":"He L, Li W, Zhuge H (2016) Exploring differential topic models for comparative summarization of scientific papers. In: COLING, pp 1028\u20131038"},{"issue":"4","key":"108_CR15","doi-asserted-by":"publisher","first-page":"378","DOI":"10.3758\/BF03198278","volume":"8","author":"DL Hintzman","year":"1980","unstructured":"Hintzman DL, Ludlam G (1980) Differential forgetting of prototypes and old instances: simulation by an exemplar-based classification model. Mem Cognit 8(4):378\u2013382","journal-title":"Mem Cognit"},{"key":"108_CR16","doi-asserted-by":"publisher","unstructured":"Jatowt A, Au\u00a0Yeung CM, Tanaka K (2013) Estimating document focus time. In: Proceedings of the 22nd ACM international conference on information and knowledge management, CIKM \u201913. ACM, New York, pp 2273\u20132278. https:\/\/doi.org\/10.1145\/2505515.2505655","DOI":"10.1145\/2505515.2505655"},{"issue":"2","key":"108_CR17","doi-asserted-by":"publisher","first-page":"498","DOI":"10.1109\/18.910572","volume":"47","author":"FR Kschischang","year":"2001","unstructured":"Kschischang FR, Frey BJ, Loeliger HA et al (2001) Factor graphs and the sum-product algorithm. 
IEEE Trans Inf Theory 47(2):498\u2013519","journal-title":"IEEE Trans Inf Theory"},{"key":"108_CR18","unstructured":"Kusner M, Sun Y, Kolkin N, Weinberger K (2015) From word embeddings to document distances. In: International conference on machine learning, pp 957\u2013966"},{"issue":"1","key":"108_CR19","doi-asserted-by":"publisher","first-page":"135","DOI":"10.1137\/S0036144503424786","volume":"47","author":"AN Langville","year":"2005","unstructured":"Langville AN, Meyer CD (2005) A survey of eigenvector methods for web information retrieval. SIAM Rev 47(1):135\u2013161","journal-title":"SIAM Rev"},{"key":"108_CR20","first-page":"387","volume":"2","author":"B Liu","year":"2002","unstructured":"Liu B, Lee WS, Yu PS, Li X (2002) Partially supervised classification of text documents. ICML 2:387\u2013394","journal-title":"ICML"},{"key":"108_CR21","doi-asserted-by":"crossref","unstructured":"Liu B, Dai Y, Li X, Lee WS, Yu PS (2003) Building text classifiers using positive and unlabeled examples. In: ICDM. IEEE, pp 179\u2013186","DOI":"10.1109\/ICDM.2003.1250918"},{"issue":"20","key":"108_CR22","doi-asserted-by":"publisher","first-page":"2023","DOI":"10.1016\/j.cub.2013.08.035","volume":"23","author":"ML Mack","year":"2013","unstructured":"Mack ML, Preston AR, Love BC (2013) Decoding the brain's algorithm for categorization from its neural implementation. Curr Biol 23(20):2023\u20132027","journal-title":"Curr Biol"},{"key":"108_CR23","unstructured":"Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781"},{"issue":"6","key":"108_CR24","doi-asserted-by":"publisher","first-page":"919","DOI":"10.1016\/j.ipm.2003.10.006","volume":"40","author":"DR Radev","year":"2004","unstructured":"Radev DR, Jing H, Sty\u015b M, Tam D (2004) Centroid-based summarization of multiple documents. 
Inf Process Manag 40(6):919\u2013938","journal-title":"Inf Process Manag"},{"key":"108_CR25","unstructured":"\u0158eh\u016f\u0159ek R, Sojka P (2010) Software framework for topic modelling with large corpora. In: Proceedings of the LREC 2010 workshop on new challenges for NLP frameworks, ELRA, Valletta, Malta, pp 45\u201350. http:\/\/is.muni.cz\/publication\/884893\/en"},{"key":"108_CR26","doi-asserted-by":"crossref","unstructured":"Ren Z, de\u00a0Rijke M (2015) Summarizing contrastive themes via hierarchical non-parametric processes. In: SIGIR. ACM, pp 93\u2013102","DOI":"10.1145\/2766462.2767713"},{"issue":"3","key":"108_CR27","doi-asserted-by":"publisher","first-page":"192","DOI":"10.1037\/0096-3445.104.3.192","volume":"104","author":"E Rosch","year":"1975","unstructured":"Rosch E (1975) Cognitive representations of semantic categories. J Exp Psychol Gen 104(3):192","journal-title":"J Exp Psychol Gen"},{"key":"108_CR28","doi-asserted-by":"crossref","unstructured":"Roth D, Yih Wt (2005) Integer linear programming inference for conditional random fields. In: ICML. ACM, pp 736\u2013743","DOI":"10.1145\/1102351.1102444"},{"key":"108_CR29","doi-asserted-by":"publisher","first-page":"53","DOI":"10.1016\/0377-0427(87)90125-7","volume":"20","author":"PJ Rousseeuw","year":"1987","unstructured":"Rousseeuw PJ (1987) Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J Comput Appl Math 20:53\u201365","journal-title":"J Comput Appl Math"},{"key":"108_CR30","doi-asserted-by":"crossref","unstructured":"Sanner S, Guo S, Graepel T, Kharazmi S, Karimi S (2011) Diverse retrieval via greedy optimization of expected 1-call@ k in a latent subtopic relevance model. In: CIKM. ACM, pp 1977\u20131980","DOI":"10.1145\/2063576.2063869"},{"key":"108_CR31","unstructured":"Singhal A (2012) Introducing the knowledge graph: things, not strings. 
Official Google blog"},{"key":"108_CR32","unstructured":"Steinberger J, Jezek K (2004) Using latent semantic analysis in text summarization and summary evaluation. In: ISIM, pp 93\u2013100"},{"key":"108_CR33","doi-asserted-by":"crossref","unstructured":"Thalhammer A, Lasierra N, Rettinger A (2016) Linksum: using link analysis to summarize entity data. In: ICWE. Springer, pp 244\u2013261","DOI":"10.1007\/978-3-319-38791-8_14"},{"key":"108_CR34","doi-asserted-by":"crossref","unstructured":"Tran TA, Nieder\u00e9e C, Kanhabua N, Gadiraju U, Anand A (2015) Balancing novelty and salience: adaptive learning to rank entities for timeline summarization of high-impact events. In: CIKM. ACM, pp 1201\u20131210","DOI":"10.1145\/2806416.2806486"},{"issue":"3","key":"108_CR35","first-page":"12","volume":"6","author":"D Wang","year":"2012","unstructured":"Wang D, Zhu S, Li T, Gong Y (2012) Comparative document summarization via discriminative sentence selection. TKDD 6(3):12","journal-title":"TKDD"},{"key":"108_CR36","doi-asserted-by":"crossref","unstructured":"Wang J, Zhu J (2009) Portfolio theory of information retrieval. In: SIGIR. ACM, pp 115\u2013122","DOI":"10.1145\/1571941.1571963"},{"issue":"12","key":"108_CR37","doi-asserted-by":"publisher","first-page":"2670","DOI":"10.1109\/TNNLS.2015.2495268","volume":"27","author":"Y Wang","year":"2016","unstructured":"Wang Y, Chen L (2016) K-meap: multiple exemplars affinity propagation with specified $$k$$ clusters. IEEE Trans Neural Netw Learn Syst 27(12):2670\u20132682","journal-title":"IEEE Trans Neural Netw Learn Syst"},{"key":"108_CR38","unstructured":"Woodsend K, Lapata M (2012) Multiple aspect summarization using integer linear programming. In: EMNLP. Association for Computational Linguistics, pp 233\u2013243"},{"key":"108_CR39","doi-asserted-by":"crossref","unstructured":"Xiao J, Wang J, Tan P, Quan L (2007) Joint affinity propagation for multiple view segmentation. In: ICCV. 
IEEE, pp 1\u20137","DOI":"10.1109\/ICCV.2007.4408928"},{"key":"108_CR40","doi-asserted-by":"crossref","unstructured":"Yan R, Wan X, Otterbacher J, Kong L, Li X, Zhang Y (2011) Evolutionary timeline summarization: a balanced optimization framework via iterative substitution. In: SIGIR. ACM, pp 745\u2013754","DOI":"10.1145\/2009916.2010016"},{"key":"108_CR41","doi-asserted-by":"crossref","unstructured":"Yu H, Han J, Chang KCC (2002) Pebl: positive example based learning for web page classification using SVM. In: SIGKDD. ACM, pp 239\u2013248","DOI":"10.1145\/775047.775083"},{"key":"108_CR42","unstructured":"Yu HT, Jatowt A, Blanco R, Joho H, Jose J, Chen L, Yuan F (2017) A concise integer linear programming formulation for implicit search result diversification. In: WSDM. ACM, pp 191\u2013200"},{"key":"108_CR43","first-page":"305","volume":"7224","author":"G Zuccon","year":"2012","unstructured":"Zuccon G, Azzopardi L, Zhang D, Wang J (2012) Top-k retrieval using facility location analysis. 
ECIR 7224:305\u2013316","journal-title":"ECIR"}],"container-title":["Data Science and Engineering"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s41019-019-00108-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s41019-019-00108-x\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s41019-019-00108-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,10,5]],"date-time":"2022-10-05T18:58:51Z","timestamp":1664996331000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s41019-019-00108-x"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,11,15]]},"references-count":43,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2019,12]]}},"alternative-id":["108"],"URL":"https:\/\/doi.org\/10.1007\/s41019-019-00108-x","relation":{},"ISSN":["2364-1185","2364-1541"],"issn-type":[{"type":"print","value":"2364-1185"},{"type":"electronic","value":"2364-1541"}],"subject":[],"published":{"date-parts":[[2019,11,15]]},"assertion":[{"value":"20 August 2019","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"18 October 2019","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 October 2019","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"15 November 2019","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}