{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,10]],"date-time":"2025-09-10T21:57:33Z","timestamp":1757541453258,"version":"3.40.5"},"reference-count":82,"publisher":"Cambridge University Press (CUP)","issue":"1","license":[{"start":{"date-parts":[[2019,4,16]],"date-time":"2019-04-16T00:00:00Z","timestamp":1555372800000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Nat. Lang. Eng."],"published-print":{"date-parts":[[2020,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>In this paper, we address query-based summarization of discussion threads. New users can profit from the information shared in the forum if they can find the previously posted information. However, discussion threads on a single topic can easily comprise dozens or hundreds of individual posts. Our aim is to summarize forum threads given real web search queries. We created a data set with search queries from a discussion forum\u2019s search engine log and the discussion threads that were clicked by the user who entered the query. For 120 thread\u2013query combinations, a reference summary was made by five different human raters. We compared two methods for automatic summarization of the threads: a query-independent method based on post features, and Maximum Marginal Relevance (MMR), a method that takes the query into account. We also compared four different word embedding representations as an alternative to standard word vectors in extractive summarization. 
We find (1) that the agreement between human summarizers does not improve when a query is provided; (2) the query-independent post features as well as a centroid-based baseline outperform MMR by a large margin; (3) combining the post features with query similarity gives a small improvement over the use of post features alone; and (4) for the word embeddings, a match in domain appears to be more important than corpus size and dimensionality. However, the differences between the models were not reflected by differences in quality of the summaries created with the help of these models. We conclude that query-based summarization with web queries is challenging because the queries are short, and a click on a result is not a direct indicator of the relevance of the result.<\/jats:p>","DOI":"10.1017\/s1351324919000123","type":"journal-article","created":{"date-parts":[[2019,4,16]],"date-time":"2019-04-16T06:39:15Z","timestamp":1555396755000},"page":"3-29","source":"Crossref","is-referenced-by-count":7,"title":["Query-based summarization of discussion threads"],"prefix":"10.1017","volume":"26","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9609-9505","authenticated-orcid":false,"given":"Suzan","family":"Verberne","sequence":"first","affiliation":[]},{"given":"Emiel","family":"Krahmer","sequence":"additional","affiliation":[]},{"given":"Sander","family":"Wubben","sequence":"additional","affiliation":[]},{"given":"Antal","family":"van den 
Bosch","sequence":"additional","affiliation":[]}],"member":"56","published-online":{"date-parts":[[2019,4,16]]},"reference":[{"key":"S1351324919000123_ref79","doi-asserted-by":"publisher","DOI":"10.1109\/TASL.2012.2229984"},{"key":"S1351324919000123_ref78","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2007.09.007"},{"key":"S1351324919000123_ref76","doi-asserted-by":"publisher","DOI":"10.1037\/a0037489"},{"first-page":"125","year":"2007","author":"Weimer","key":"S1351324919000123_ref75"},{"key":"S1351324919000123_ref72","doi-asserted-by":"publisher","DOI":"10.1002\/asi.22948"},{"key":"S1351324919000123_ref70","doi-asserted-by":"publisher","DOI":"10.2196\/jmir.992"},{"key":"S1351324919000123_ref71","doi-asserted-by":"publisher","DOI":"10.1016\/j.pec.2008.07.044"},{"year":"2016","author":"Tulkens","key":"S1351324919000123_ref68"},{"first-page":"2","year":"1998","author":"Tombros","key":"S1351324919000123_ref65"},{"key":"S1351324919000123_ref64","doi-asserted-by":"publisher","DOI":"10.1017\/S135132491000001X"},{"first-page":"224","year":"2012","author":"Sipos","key":"S1351324919000123_ref61"},{"key":"S1351324919000123_ref80","doi-asserted-by":"publisher","DOI":"10.1109\/TCYB.2016.2628402"},{"first-page":"879","year":"2011","author":"Ren","key":"S1351324919000123_ref58"},{"first-page":"375","year":"2003","author":"Radev","key":"S1351324919000123_ref57"},{"first-page":"470","year":"2008","author":"Penn","key":"S1351324919000123_ref54"},{"key":"S1351324919000123_ref52","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2015.7178974"},{"key":"S1351324919000123_ref49","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4614-3223-4_3"},{"first-page":"40","year":"2008","author":"Metzler","key":"S1351324919000123_ref42"},{"first-page":"495","year":"2000","author":"Lin","key":"S1351324919000123_ref37"},{"first-page":"1197","year":"2014","author":"Li","key":"S1351324919000123_ref35"},{"first-page":"68","year":"1995","author":"Kupiec","key":"S1351324919000123_ref32"},{"key
":"S1351324919000123_ref29","first-page":"1120","article-title":"Using graded relevance assessments in IR evaluation","volume":"53","author":"Kek\u00e4l\u00e4inen","year":"2002","journal-title":"Journal of the Association for Information Science and Technology"},{"first-page":"21","year":"2015","author":"Kabadjov","key":"S1351324919000123_ref28"},{"key":"S1351324919000123_ref27","first-page":"299","article-title":"Query-based forum posts extraction and refinement","volume":"1","author":"Hussain","year":"2014","journal-title":"International Journal on Engineering Technology and Sciences \u2013 IJETS"},{"first-page":"604","year":"2006","author":"Hovy","key":"S1351324919000123_ref26"},{"key":"S1351324919000123_ref23","doi-asserted-by":"publisher","DOI":"10.4304\/jetwi.2.3.258-268"},{"first-page":"19","year":"2001","author":"Gong","key":"S1351324919000123_ref22"},{"first-page":"270","year":"2015","author":"Giannakopoulos","key":"S1351324919000123_ref21"},{"first-page":"181","year":"2010","author":"Dupret","key":"S1351324919000123_ref19"},{"key":"S1351324919000123_ref16","first-page":"192","article-title":"A survey on automatic text summarization","volume":"4","author":"Das","year":"2007","journal-title":"Literature Survey for the Language and Statistics II course at CMU"},{"key":"S1351324919000123_ref15","first-page":"1","volume":"2005","author":"Dang","year":"2005"},{"first-page":"2108","year":"2014","author":"Chowdhury","key":"S1351324919000123_ref13"},{"first-page":"2779","year":"2015","author":"Chowdhury","key":"S1351324919000123_ref12"},{"key":"S1351324919000123_ref48","doi-asserted-by":"publisher","DOI":"10.1561\/1500000015"},{"year":"2016","author":"Cheng","key":"S1351324919000123_ref11"},{"first-page":"2153","year":"2015a","author":"Cao","key":"S1351324919000123_ref8"},{"key":"S1351324919000123_ref7","first-page":"1300","volume-title":"Twenty-Fourth Conference on Artificial 
Intelligence","volume":"10","author":"Bhatia","year":"2010"},{"key":"S1351324919000123_ref46","first-page":"1611.04230.","article-title":"SummaRuNNer: A Recurrent Neural Network based Sequence Model for Extractive Summarization of Documents","author":"Nallapati","year":"2016","journal-title":"arXiv preprint arXiv"},{"first-page":"42","year":"2016","author":"Barker","key":"S1351324919000123_ref5"},{"key":"S1351324919000123_ref3","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-20161-5_16"},{"key":"S1351324919000123_ref2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-30671-1_57"},{"key":"S1351324919000123_ref1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W16-6610"},{"first-page":"195","year":"2013","author":"Krishnamani","key":"S1351324919000123_ref31"},{"key":"S1351324919000123_ref53","unstructured":"Pembe, F.C. and G\u00fcng\u00f6r, T. (2007). Automated query-biased and structure-preserving text summarization on web documents. In Proceedings of the International Symposium on Innovations in Intelligent Systems and Applications, \u0130stanbul."},{"first-page":"1","year":"2016","author":"Dlikman","key":"S1351324919000123_ref18"},{"key":"S1351324919000123_ref17","first-page":"1406.3830","article-title":"Modelling, visualising and summarising documents with a single convolutional neural network","author":"Denil","year":"2014","journal-title":"arXiv preprint arXiv"},{"first-page":"35","year":"2011","author":"Teevan","key":"S1351324919000123_ref63"},{"first-page":"829","year":"2015b","author":"Cao","key":"S1351324919000123_ref9"},{"key":"S1351324919000123_ref14","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2017.02.006"},{"year":"2007","author":"Toutanova","key":"S1351324919000123_ref66"},{"first-page":"35","year":"2016","author":"Guy","key":"S1351324919000123_ref24"},{"first-page":"205","year":"2008","author":"Schilder","key":"S1351324919000123_ref59"},{"key":"S1351324919000123_ref69","first-page":"127","article-title":"Analyzing cancer 
forum discussions with text mining","author":"van Oortmerssen","year":"2017","journal-title":"Knowledge Representation for Health Care Process-Oriented Information Systems in Health Care Extraction and Processing of Rich Semantics from Medical Texts"},{"first-page":"74","year":"2004","author":"Lin","key":"S1351324919000123_ref36"},{"key":"S1351324919000123_ref33","unstructured":"Kusner, M. , Sun, Y. , Kolkin, N. and Weinberger, K. (2015). From word embeddings to document distances. In International Conference on Machine Learning (ICML), vol. 15. Lille, France. pp. 957\u2013966. http:\/\/proceedings.mlr.press."},{"key":"S1351324919000123_ref43","first-page":"1301.3781.","article-title":"Efficient estimation of word representations in vector space","author":"Mikolov","year":"2013","journal-title":"arXiv preprint arXiv"},{"key":"S1351324919000123_ref73","doi-asserted-by":"publisher","DOI":"10.1007\/s10579-017-9389-4"},{"first-page":"45","year":"2014","author":"Oya","key":"S1351324919000123_ref50"},{"key":"S1351324919000123_ref25","doi-asserted-by":"publisher","DOI":"10.1109\/2.881692"},{"first-page":"298","year":"2005","author":"Zhou","key":"S1351324919000123_ref81"},{"first-page":"335","year":"1998","author":"Carbonell","key":"S1351324919000123_ref10"},{"key":"S1351324919000123_ref45","doi-asserted-by":"crossref","first-page":"593","DOI":"10.21437\/Interspeech.2005-59","volume-title":"INTERSPEECH-2005","author":"Murray","year":"2005"},{"first-page":"704","year":"2009","author":"Amini","key":"S1351324919000123_ref4"},{"key":"S1351324919000123_ref82","first-page":"237","volume-title":"AAAI Spring Symposium: Computational Approaches to Analyzing 
Weblogs","author":"Zhou","year":"2006"},{"key":"S1351324919000123_ref30","first-page":"941","volume":"1","author":"Kenter","year":"2016"},{"key":"S1351324919000123_ref41","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P14-1115"},{"first-page":"345","year":"2012","author":"Powers","key":"S1351324919000123_ref55"},{"key":"S1351324919000123_ref34","doi-asserted-by":"publisher","DOI":"10.2307\/2529310"},{"first-page":"1383","year":"2015","author":"Yin","key":"S1351324919000123_ref77"},{"first-page":"84","year":"2006","author":"Park","key":"S1351324919000123_ref51"},{"key":"S1351324919000123_ref56","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2003.10.006"},{"key":"S1351324919000123_ref20","doi-asserted-by":"publisher","DOI":"10.1613\/jair.1523"},{"key":"S1351324919000123_ref62","first-page":"448","article-title":"Enhancing Single-Document Summarization by Combining RankNet and Third-Party Sources","author":"Svore","year":"2007","journal-title":"EMNLP-CoNLL"},{"first-page":"2127","year":"2014","author":"Bhatia","key":"S1351324919000123_ref6"},{"key":"S1351324919000123_ref47","first-page":"1611.04244v1.","article-title":"Classify Or select: neural architectures for extractive document summarization","author":"Nallapati","year":"2016","journal-title":"arXiv preprint arXiv"},{"first-page":"99","year":"2010","author":"Marge","key":"S1351324919000123_ref40"},{"first-page":"599","year":"2014","author":"Llewellyn","key":"S1351324919000123_ref39"},{"first-page":"301","year":"2005","author":"Wan","key":"S1351324919000123_ref74"},{"key":"S1351324919000123_ref44","doi-asserted-by":"publisher","DOI":"10.1109\/ISSPIT.2006.270835"},{"volume-title":"Proceedings of 2016 IEEE Workshop on Spoken Language Technology","year":"2016","author":"Tsai","key":"S1351324919000123_ref67"},{"key":"S1351324919000123_ref60","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2011.91"},{"first-page":"201","year":"2008","author":"Liu","key":"S1351324919000123_ref38"}],"container-title":["Natural 
Language Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.cambridge.org\/core\/services\/aop-cambridge-core\/content\/view\/S1351324919000123","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,9,16]],"date-time":"2022-09-16T01:25:41Z","timestamp":1663291541000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.cambridge.org\/core\/product\/identifier\/S1351324919000123\/type\/journal_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,4,16]]},"references-count":82,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,1]]}},"alternative-id":["S1351324919000123"],"URL":"https:\/\/doi.org\/10.1017\/s1351324919000123","relation":{},"ISSN":["1351-3249","1469-8110"],"issn-type":[{"type":"print","value":"1351-3249"},{"type":"electronic","value":"1469-8110"}],"subject":[],"published":{"date-parts":[[2019,4,16]]}}}