{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:12:05Z","timestamp":1750306325023,"version":"3.41.0"},"reference-count":48,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2016,8,16]],"date-time":"2016-08-16T00:00:00Z","timestamp":1471305600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"European Commission Seventh Framework Program","award":["600826"],"award-info":[{"award-number":["600826"]}]},{"name":"German Federal Ministry of Education"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Web"],"published-print":{"date-parts":[[2016,8,29]]},"abstract":"<jats:p>\n            It has been shown that top-\n            <jats:italic>k<\/jats:italic>\n            retrieval quality can be considerably improved by taking not only relevance but also diversity into account. However, currently proposed diversification approaches have not put much attention on practical usability in large-scale settings, such as modern web search systems. In this work, we make two contributions toward this goal. First, we propose a combination of optimizations and heuristics for an implicit diversification algorithm based on the desirable facility placement principle, and present two algorithms that achieve linear complexity without compromising the retrieval effectiveness. Instead of an exhaustive comparison of documents, these algorithms first perform a clustering phase and then exploit its outcome to compose the diverse result set. Second, we describe and analyze two variants for distributed diversification in a computing cluster, for large-scale IR where the document collection is too large to keep in one node. 
Our contribution in this direction is pioneering, as there exists no earlier work in the literature that investigates the effectiveness and efficiency of diversification on a distributed setup. Extensive evaluations on a standard TREC framework demonstrate a competitive retrieval quality of the proposed optimizations to the baseline algorithm while reducing the processing time by more than 80% and up to 97%, and shed light on the efficiency and effectiveness tradeoffs of diversification when applied on top of a distributed architecture.\n          <\/jats:p>","DOI":"10.1145\/2907948","type":"journal-article","created":{"date-parts":[[2016,8,16]],"date-time":"2016-08-16T19:02:14Z","timestamp":1471374134000},"page":"1-30","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":13,"title":["Scalable and Efficient Web Search Result Diversification"],"prefix":"10.1145","volume":"10","author":[{"given":"Kaweh Djafari","family":"Naini","sequence":"first","affiliation":[{"name":"L3S Research Center, Hannover, Germany"}]},{"given":"Ismail Sengor","family":"Altingovde","sequence":"additional","affiliation":[{"name":"Middle East Technical University, Ankara, Turkey"}]},{"given":"Wolf","family":"Siberski","sequence":"additional","affiliation":[{"name":"Bundesdruckerei GmbH"}]}],"member":"320","published-online":{"date-parts":[[2016,8,16]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/1498759.1498766"},{"key":"e_1_2_1_2_1","series-title":"The Information Retrieval Series","volume-title":"Advanced Topics in Information Retrieval, Massimo Melucci and Ricardo Baeza-Yates (Eds.)","author":"Cambazoglu Berkant Barla","unstructured":"
Berkant Barla Cambazoglu and Ricardo Baeza-Yates. 2011. Scalability challenges in web search engines. In Advanced Topics in Information Retrieval, Massimo Melucci and Ricardo Baeza-Yates (Eds.). The Information Retrieval Series, Vol. 33. 27--50."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/1835449.1835467"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.14778\/1988776.1988781"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/290941.291025"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2011.08.004"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-04417-5_18"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/1645953.1646116"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10791-011-9167-7"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1645953.1646033"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patrec.2004.11.014"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/1148170.1148245"},{"key":"e_1_2_1_13_1","volume-title":"Overview of the TREC 2009 web track. In Proceedings of the 18th Text Retrieval Conference.","author":"Clarke Charles L. A.","year":"2009","unstructured":"Charles L. A. Clarke, Nick Craswell, and Ian Soboroff. 2009. Overview of the TREC 2009 web track. In Proceedings of the 18th Text Retrieval Conference."},{"volume-title":"Overview of the TREC 2010 web track. In Proceedings of the 19th Text Retrieval Conference.","author":"Clarke Charles L. A.","key":"e_1_2_1_14_1","unstructured":"Charles L. A. 
Clarke, Nick Craswell, Ian Soboroff, and Gordon V. Cormack. 2010. Overview of the TREC 2010 web track. In Proceedings of the 19th Text Retrieval Conference."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/1390334.1390446"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/2348283.2348296"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/2484028.2484095"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/1498759.1498761"},{"key":"e_1_2_1_19_1","first-page":"49","article-title":"Diversity over continuous data","volume":"32","author":"Drosou Marina","year":"2009","unstructured":"Marina Drosou and Evaggelia Pitoura. 2009. Diversity over continuous data. IEEE Data Eng. Bull. 32, 4 (2009), 49--56.","journal-title":"IEEE Data Eng. Bull."},{"key":"e_1_2_1_20_1","volume-title":"On the impact of random index-partitioning on index compression. CoRR abs\/1107.5661","author":"Feldman Moran","year":"2011","unstructured":"Moran Feldman, Ronny Lempel, Oren Somekh, and Kolman Vornovitsky. 2011. On the impact of random index-partitioning on index compression. CoRR abs\/1107.5661 (2011). 
http:\/\/arxiv.org\/abs\/1107.5661."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.5555\/2051073.2051107"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jda.2012.07.004"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/1526709.1526761"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1002\/asi.21468"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/582415.582418"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/1124772.1124877"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/1871437.1871497"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/2600428.2609561"},{"key":"e_1_2_1_29_1","first-page":"77","article-title":"Portfolio selection","volume":"7","author":"Markowitz Harry","year":"1952","unstructured":"Harry Markowitz. 1952. Portfolio selection. J. Finance 7, 1 (1952), 77--91.","journal-title":"J. Finance"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/2009916.2009996"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2010.12.007"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2661829.2661975"},{"key":"e_1_2_1_33_1","first-page":"1212","article-title":"Explicit search result diversification using score and rank aggregation methods","volume":"66","author":"Ozdemiray Ahmet Murat","year":"2015","unstructured":"Ahmet Murat Ozdemiray and Ismail Sengor Altingovde. 2015. Explicit search result diversification using score and rank aggregation methods. JASIST 66, 6 (2015), 1212--1228. 
DOI:http:\/\/dx.doi.org\/ 10.1002\/asi.23259","journal-title":"JASIST"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/1146847.1146848"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/1146847.1146881"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/1148170.1148320"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1108\/eb026647"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/1772690.1772780"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/1871437.1871586"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/2009916.2009997"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/2348283.2348396"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/2348283.2348297"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/1571941.1571963"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/2600428.2609451"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/860435.860440"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2004.11.003"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-12275-0_32"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-28997-2_26"}],"container-title":["ACM Transactions on the 
Web"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2907948","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2907948","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:54:27Z","timestamp":1750222467000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2907948"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,8,16]]},"references-count":48,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2016,8,29]]}},"alternative-id":["10.1145\/2907948"],"URL":"https:\/\/doi.org\/10.1145\/2907948","relation":{},"ISSN":["1559-1131","1559-114X"],"issn-type":[{"type":"print","value":"1559-1131"},{"type":"electronic","value":"1559-114X"}],"subject":[],"published":{"date-parts":[[2016,8,16]]},"assertion":[{"value":"2014-12-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2016-03-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2016-08-16","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}