{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,21]],"date-time":"2025-10-21T00:38:57Z","timestamp":1761007137224,"version":"build-2065373602"},"reference-count":31,"publisher":"Wiley","issue":"1","license":[{"start":{"date-parts":[[2013,1,24]],"date-time":"2013-01-24T00:00:00Z","timestamp":1358985600000},"content-version":"vor","delay-in-days":389,"URL":"http:\/\/onlinelibrary.wiley.com\/termsAndConditions#vor"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc of Assoc for Info"],"published-print":{"date-parts":[[2012,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Search result diversification enables the modern day search engines to construct a result list that consists of documents that are relevant to the user query and at the same time, diverse enough to meet the expectations of a diverse user population. However, all the queries received by a search engine may not benefit from diversification. Further, different types of queries may benefit from different diversification mechanisms. In this paper we present an analysis of logs of a commercial web search engine and study the web search queries for their diversification requirements. We analyze queries based on their click entropy and popularity and propose a query taxonomy based on their diversification requirements. We then carry out the task of automatically classifying web search queries into one of the classes of our proposed taxonomy. 
We utilize various query\u2010based, click\u2010based and reformulation\u2010based features for the query classification task and achieve strong classification results.<\/jats:p>","DOI":"10.1002\/meet.14504901188","type":"journal-article","created":{"date-parts":[[2013,1,24]],"date-time":"2013-01-24T10:49:23Z","timestamp":1359024563000},"page":"1-10","source":"Crossref","is-referenced-by-count":5,"title":["Analysis and automatic classification of web search queries for diversification requirements"],"prefix":"10.1002","volume":"49","author":[{"given":"Sumit","family":"Bhatia","sequence":"first","affiliation":[]},{"given":"Cliff","family":"Brunk","sequence":"additional","affiliation":[]},{"given":"Prasenjit","family":"Mitra","sequence":"additional","affiliation":[]}],"member":"311","published-online":{"date-parts":[[2013,1,24]]},"reference":[{"key":"e_1_2_8_2_1","doi-asserted-by":"crossref","unstructured":"Agrawal R. Gollapudi S. Halverson A.&Ieong S.(2009) Diversifying search results in\u2018WSDM '09: Proceedings of the Second ACM International Conference on Web Search and Data Mining\u2019 ACM pp.5\u201314.","DOI":"10.1145\/1498759.1498766"},{"key":"e_1_2_8_3_1","doi-asserted-by":"crossref","unstructured":"Beitzel S. M. Jensen E. C. Chowdhury A. Grossman D.&Frieder O.(2004) Hourly analysis of a very large topically categorized web query log in\u2018Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval\u2019 SIGIR '04 ACM New York NY USA pp.321\u2013328.http:\/\/doi.acm.org\/10.1145\/1008992.1009048","DOI":"10.1145\/1008992.1009048"},{"key":"e_1_2_8_4_1","doi-asserted-by":"crossref","unstructured":"Bendersky M.&Croft W. 
B.(2008) Discovering key concepts in verbose queries in\u2018Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval\u2019 SIGIR '08 ACM New York NY USA pp.491\u2013498.http:\/\/doi.acm.org\/10.1145\/1390334.1390419","DOI":"10.1145\/1390334.1390419"},{"key":"e_1_2_8_5_1","doi-asserted-by":"publisher","DOI":"10.1016\/0306-4573(82)90033-4"},{"key":"e_1_2_8_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/792550.792552"},{"key":"e_1_2_8_7_1","doi-asserted-by":"crossref","unstructured":"Carbonell J.&Goldstein J.(1998) The use of mmr diversity\u2010based reranking for reordering documents and producing summaries in\u2018SIGIR '98: Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval\u2019 ACM New York NY USA pp.335\u2013336.","DOI":"10.1145\/290941.291025"},{"key":"e_1_2_8_8_1","doi-asserted-by":"crossref","unstructured":"Chen H.&Karger D. R.(2006) Less is more: probabilistic models for retrieving fewer relevant documents in\u2018Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval\u2019 SIGIR '06 ACM New York NY USA pp.429\u2013436.http:\/\/doi.acm.org\/10.1145\/1148170.1148245","DOI":"10.1145\/1148170.1148245"},{"key":"e_1_2_8_9_1","doi-asserted-by":"crossref","unstructured":"Clough P. Sanderson M. Abouammoh M. Navarro S.&Paramita M.(2009) Multiple approaches to analysing query diversity in\u2018SIGIR '09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval\u2019 ACM New York NY USA pp.734\u2013735.","DOI":"10.1145\/1571941.1572102"},{"key":"e_1_2_8_10_1","doi-asserted-by":"crossref","unstructured":"Cronen\u2010Townsend S. Zhou Y.&Croft W. 
B.(2002) Predicting query performance in\u2018SIGIR '02: Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval\u2019 ACM New York NY USA pp.299\u2013306.","DOI":"10.1145\/564376.564429"},{"key":"e_1_2_8_11_1","doi-asserted-by":"crossref","unstructured":"Dai H. K. Zhao L. Nie Z. Wen J.\u2010R. Wang L.&Li Y.(2006) Detecting online commercial intention (oci) in\u2018Proceedings of the 15th international conference on World Wide Web\u2019 WWW '06 ACM New York NY USA pp.829\u2013837.http:\/\/doi.acm.org\/10.1145\/1135777.1135902","DOI":"10.1145\/1135777.1135902"},{"key":"e_1_2_8_12_1","doi-asserted-by":"crossref","unstructured":"Dou Z. Song R.&Wen J.\u2010R.(2007) A large\u2010scale evaluation and analysis of personalized search strategies in\u2018Proceedings of the 16th international conference on World Wide Web\u2019 WWW '07 ACM New York NY USA pp.581\u2013590.","DOI":"10.1145\/1242572.1242651"},{"key":"e_1_2_8_13_1","doi-asserted-by":"crossref","unstructured":"Gollapudi S.&Sharma A.(2009) An axiomatic approach for result diversification in\u2018WWW '09: Proceedings of the 18th international conference on World wide web\u2019 ACM New York NY USA pp.381\u2013390.","DOI":"10.1145\/1526709.1526761"},{"key":"e_1_2_8_14_1","doi-asserted-by":"crossref","unstructured":"Gravano L. Hatzivassiloglou V.&Lichtenstein R.(2003) Categorizing web queries according to geographical locality in\u2018Proceedings of the twelfth international conference on Information and knowledge management\u2019 CIKM '03 ACM New York NY USA pp.325\u2013333.http:\/\/doi.acm.org\/10.1145\/956863.956925","DOI":"10.1145\/956863.956925"},{"key":"e_1_2_8_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/1656274.1656278"},{"key":"e_1_2_8_16_1"},{"key":"e_1_2_8_17_1","doi-asserted-by":"publisher","DOI":"10.1002\/asi.21071"},{"key":"e_1_2_8_18_1","unstructured":"Jansen B. J.&Spink A.(2003) An analysis of web documents retrieved and viewed inH. R. 
Arabnia & Y. Mun eds \u2018International Conference on Internet Computing\u2019 CSREA Press pp.65\u201369."},{"key":"e_1_2_8_19_1"},{"key":"e_1_2_8_20_1","doi-asserted-by":"crossref","unstructured":"Kang I.\u2010H.&Kim G.(2003) Query type classification for web document retrieval in\u2018Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval\u2019 SIGIR '03 ACM New York NY USA pp.64\u201371.http:\/\/doi.acm.org\/10.1145\/860435.860449","DOI":"10.1145\/860435.860449"},{"key":"e_1_2_8_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/1117454.1117466"},{"key":"e_1_2_8_22_1","doi-asserted-by":"crossref","unstructured":"Lu Y. Peng F. Wei X.&Dumoulin B.(2010) Personalize web search results with user's location in\u2018Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval\u2019 SIGIR '10 ACM New York NY USA pp.763\u2013764.http:\/\/doi.acm.org\/10.1145\/1835449.1835604","DOI":"10.1145\/1835449.1835604"},{"key":"e_1_2_8_23_1","doi-asserted-by":"publisher","DOI":"10.1002\/asi.4630270302"},{"key":"e_1_2_8_24_1"},{"key":"e_1_2_8_25_1","doi-asserted-by":"crossref","unstructured":"Sanderson M.(2008) Ambiguous queries: test collections need more sense in\u2018SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval\u2019 ACM New York NY USA pp.499\u2013506.","DOI":"10.1145\/1390334.1390420"},{"key":"e_1_2_8_26_1","doi-asserted-by":"crossref","unstructured":"Santos R. L. Macdonald C.&Ounis I.(2010a) Exploiting query reformulations for web search result diversification in\u2018WWW '10: Proceedings of the 19th international conference on World wide web\u2019 ACM New York NY USA pp.881\u2013890.","DOI":"10.1145\/1772690.1772780"},{"key":"e_1_2_8_27_1","doi-asserted-by":"crossref","unstructured":"Santos R. L. 
Macdonald C.&Ounis I.(2010b) Selectively diversifying web search results in\u2018Proceedings of the 19th ACM international conference on Information and knowledge management\u2019 CIKM '10 ACM pp.1179\u20131188.http:\/\/doi.acm.org\/10.1145\/1871437.1871586","DOI":"10.1145\/1871437.1871586"},{"key":"e_1_2_8_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/331403.331405"},{"key":"e_1_2_8_29_1","doi-asserted-by":"crossref","unstructured":"Teevan J. Dumais S. T.&Liebling D. J.(2008) To personalize or not to personalize: modeling queries with variation in user intent in\u2018Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval\u2019 SIGIR '08 ACM New York NY USA pp.163\u2013170.http:\/\/doi.acm.org\/10.1145\/1390334.1390364","DOI":"10.1145\/1390334.1390364"},{"key":"e_1_2_8_30_1","doi-asserted-by":"crossref","unstructured":"Wang J.&Zhu J.(2009) Portfolio theory of information retrieval in\u2018SIGIR '09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval\u2019 ACM New York NY USA pp.115\u2013122.","DOI":"10.1145\/1571941.1571963"},{"key":"e_1_2_8_31_1","unstructured":"Wang Y.&Agichtein E.(2010) Query ambiguity revisited: clickthrough measures for distinguishing informational and ambiguous queries in\u2018Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics\u2019 HLT '10 Association for Computational Linguistics Stroudsburg PA USA pp.361\u2013364.http:\/\/portal.acm.org\/citation.cfm?id=1857999.1858054"},{"key":"e_1_2_8_32_1","doi-asserted-by":"crossref","unstructured":"Welch M. J. 
Cho J.&Olston C.(2011) Search result diversity for informational queries in\u2018Proceedings of the 20th international conference on World wide web\u2019 WWW '11 ACM New York NY USA pp.237\u2013246.http:\/\/doi.acm.org\/10.1145\/1963405.1963441","DOI":"10.1145\/1963405.1963441"}],"container-title":["Proceedings of the American Society for Information Science and Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/api.wiley.com\/onlinelibrary\/tdm\/v1\/articles\/10.1002%2Fmeet.14504901188","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/api.wiley.com\/onlinelibrary\/tdm\/v1\/articles\/10.1002%2Fmeet.14504901188","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/asistdl.onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/meet.14504901188","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,20]],"date-time":"2025-10-20T11:34:59Z","timestamp":1760960099000},"score":1,"resource":{"primary":{"URL":"https:\/\/asistdl.onlinelibrary.wiley.com\/doi\/10.1002\/meet.14504901188"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,1]]},"references-count":31,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2012,1]]}},"alternative-id":["10.1002\/meet.14504901188"],"URL":"https:\/\/doi.org\/10.1002\/meet.14504901188","archive":["Portico"],"relation":{},"ISSN":["0044-7870","1550-8390"],"issn-type":[{"type":"print","value":"0044-7870"},{"type":"electronic","value":"1550-8390"}],"subject":[],"published":{"date-parts":[[2012,1]]}}}