{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,5]],"date-time":"2025-10-05T19:53:28Z","timestamp":1759694008308,"version":"3.41.0"},"reference-count":52,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2015,2,17]],"date-time":"2015-02-17T00:00:00Z","timestamp":1424131200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"EU FET projects MULTIPLEX 317532 and SIMPOL 610704"},{"name":"EU FP7 project 255403-SNAPS"},{"name":"PRIN 2008 research projects COGENT (COmputational and GamE-theoretic aspects of uncoordinated NeTworks) and Mad Web (Models, Algorithms and Data structures for the Web and other behavioral networks)"},{"name":"EU ERC StG project PAAl 259515"},{"name":"Italian Ministry of University and Research"},{"name":"Google Focused Award &ldquo;Algorithms for Large-Scale Data Analysis&rdquo;"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Inf. Syst."],"published-print":{"date-parts":[[2015,3,23]]},"abstract":"<jats:p>We design algorithms that, given a collection of documents and a distribution over user queries, return a small subset of the document collection in such a way that we can efficiently provide high-quality answers to user queries using only the selected subset. This approach has applications when space is a constraint or when the query-processing time increases significantly with the size of the collection. We study our algorithms through the lens of stochastic analysis and prove that even though they use only a small fraction of the entire collection, they can provide answers to most user queries, achieving a performance close to the optimal. To complement our theoretical findings, we experimentally show the versatility of our approach by considering two important cases in the context of Web search. In the first case, we favor the retrieval of documents that are relevant to the query, whereas in the second case we aim for document diversification. Both the theoretical and the experimental analysis provide strong evidence of the potential value of query covering in diverse application scenarios.<\/jats:p>","DOI":"10.1145\/2699671","type":"journal-article","created":{"date-parts":[[2015,2,18]],"date-time":"2015-02-18T13:24:05Z","timestamp":1424265845000},"page":"1-35","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":9,"title":["Stochastic Query Covering for Fast Approximate Document Retrieval"],"prefix":"10.1145","volume":"33","author":[{"given":"Aris","family":"Anagnostopoulos","sequence":"first","affiliation":[{"name":"Sapienza, University of Rome, Italy"}]},{"given":"Luca","family":"Becchetti","sequence":"additional","affiliation":[{"name":"Sapienza, University of Rome, Italy"}]},{"given":"Ilaria","family":"Bordino","sequence":"additional","affiliation":[{"name":"Yahoo Labs, Barcelona, Spain"}]},{"given":"Stefano","family":"Leonardi","sequence":"additional","affiliation":[{"name":"Sapienza, University of Rome, Italy"}]},{"given":"Ida","family":"Mele","sequence":"additional","affiliation":[{"name":"Sapienza, University of Rome, Italy"}]},{"given":"Piotr","family":"Sankowski","sequence":"additional","affiliation":[{"name":"University of Warsaw, Poland"}]}],"member":"320","published-online":{"date-parts":[[2015,2,17]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/304181.304207"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/1498759.1498766"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/780542.780558"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/1935826.1935923"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-69507-3_1"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2007.367846"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/1409220.1409223"},{"key":"e_1_2_1_8_1","series-title":"Lecture Notes in Computer Science","volume-title":"String Processing and Information Retrieval","author":"Baeza-Yates Ricardo","unstructured":"Ricardo Baeza-Yates , Flavio Junqueira , Vassilis Plachouras , and Hans Friedrich Witschel . 2007. Admission policies for caches of search engine results . In String Processing and Information Retrieval . Lecture Notes in Computer Science , Vol. 4726 . Springer , 74--85. DOI:http:\/\/dx.doi.org\/10.1007\/978-3-540-75530-2_7 10.1007\/978-3-540-75530-2_7 Ricardo Baeza-Yates, Flavio Junqueira, Vassilis Plachouras, and Hans Friedrich Witschel. 2007. Admission policies for caches of search engine results. In String Processing and Information Retrieval. Lecture Notes in Computer Science, Vol. 4726. Springer, 74--85. DOI:http:\/\/dx.doi.org\/10.1007\/978-3-540-75530-2_7"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/1571941.1571971"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2003.1196112"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/1458082.1458163"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/WI-IAT.2009.34"},{"key":"e_1_2_1_13_1","first-page":"331","article-title":"Information retrieval: A sequential learning process","volume":"34","author":"Bookstein Abraham","year":"1983","unstructured":"Abraham Bookstein . 1983 . Information retrieval: A sequential learning process . Journal of the Association for Information Science and Technology 34 , 5, 331 -- 342 . Abraham Bookstein. 1983. Information retrieval: A sequential learning process. Journal of the Association for Information Science and Technology 34, 5, 331--342.","journal-title":"Journal of the Association for Information Science and Technology"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1016\/0306-4573(82)90033-4"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1287\/moor.1080.0363"},{"key":"e_1_2_1_16_1","doi-asserted-by":"crossref","unstructured":"Rajkumar Buyya Mukaddim Pathan and Athena Vakali (Eds.). 2008. Content Delivery Networks. Springer Berlin Germany.   Rajkumar Buyya Mukaddim Pathan and Athena Vakali (Eds.). 2008. Content Delivery Networks. Springer Berlin Germany.","DOI":"10.1007\/978-3-540-77887-5"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/1571941.1572013"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.14778\/1988776.1988781"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/290941.291025"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/1951365.1951379"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/1148170.1148245"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/1559795.1559815"},{"key":"e_1_2_1_23_1","volume-title":"Overview of the TREC 2009 Web track. In Proceedings of TREC.","author":"Clarke Charles L. A.","year":"2009","unstructured":"Charles L. A. Clarke , Nick Craswell , and Ian Soboroff . 2009 . Overview of the TREC 2009 Web track. In Proceedings of TREC. Charles L. A. Clarke, Nick Craswell, and Ian Soboroff. 2009. Overview of the TREC 2009 Web track. In Proceedings of TREC."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/1390334.1390446"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/564691.564699"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/1125857.1125859"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/1526709.1526768"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/564691.564746"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1016\/0020-0271(64)90006-3"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/FOCS.2008.31"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/582415.582418"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/1060590.1060649"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2004.11.003"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/WI-IAT.2010.139"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/775152.775156"},{"key":"e_1_2_1_36_1","volume-title":"Proceedings of NIPS 23","author":"Leung Gilbert","year":"2010","unstructured":"Gilbert Leung , Novi Quadrianto , Alexander Smola , and Kostas Tsioutsiouliklis . 2010 . Optimal Web-scale tiering as a flow problem . In Proceedings of NIPS 23 . 1333--1341. Gilbert Leung, Novi Quadrianto, Alexander Smola, and Kostas Tsioutsiouliklis. 2010. Optimal Web-scale tiering as a flow problem. In Proceedings of NIPS 23. 1333--1341."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0140-3664(00)00308-X"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/1277741.1277776"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/1146847.1146848"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/1390156.1390255"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/1772690.1772770"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/215206.215381"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.5555\/951953.952384"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/1772690.1772780"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/383952.383959"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/331403.331405"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1002\/1097-4571(2000)9999:9999%3C::AID-ASI1591%3E3.3.CO;2-I"},{"volume-title":"Approximation Algorithms","author":"Vazirani Vijay V.","key":"e_1_2_1_48_1","unstructured":"Vijay V. Vazirani . 2001. Approximation Algorithms . Springer-Verlag , New York, NY . Vijay V. Vazirani. 2001. Approximation Algorithms. Springer-Verlag, New York, NY."},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2008.4497431"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/1571941.1571963"},{"key":"e_1_2_1_51_1","volume-title":"Proceedings of IEEE Infocom\u201902","author":"Xie Yinglian","year":"2002","unstructured":"Yinglian Xie and David O\u2019Hallaron . 2002 . Locality in search engine queries and its implications for caching . In Proceedings of IEEE Infocom\u201902 . 1238--1247. Yinglian Xie and David O\u2019Hallaron. 2002. Locality in search engine queries and its implications for caching. In Proceedings of IEEE Infocom\u201902. 1238--1247."},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/1060745.1060754"}],"container-title":["ACM Transactions on Information Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2699671","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2699671","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T06:16:59Z","timestamp":1750227419000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2699671"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,2,17]]},"references-count":52,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2015,3,23]]}},"alternative-id":["10.1145\/2699671"],"URL":"https:\/\/doi.org\/10.1145\/2699671","relation":{},"ISSN":["1046-8188","1558-2868"],"issn-type":[{"type":"print","value":"1046-8188"},{"type":"electronic","value":"1558-2868"}],"subject":[],"published":{"date-parts":[[2015,2,17]]},"assertion":[{"value":"2014-01-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2014-11-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2015-02-17","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}