{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,22]],"date-time":"2026-03-22T15:55:12Z","timestamp":1774194912833,"version":"3.50.1"},"reference-count":41,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2010,9,1]],"date-time":"2010-09-01T00:00:00Z","timestamp":1283299200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Web"],"published-print":{"date-parts":[[2010,9]]},"abstract":"<jats:p>\n            Web search engines can perform poorly for long queries (i.e., those containing four or more terms), in part because of their high level of query specificity. The automatic assignment of labels to long queries can capture aspects of a user\u2019s search intent that may not be apparent from the terms in the query. This affords search result matching or reranking based on queries and labels rather than the query text alone. Query labels can be derived from interaction logs generated from many users\u2019 search result clicks or from\n            <jats:italic>query trails<\/jats:italic>\n            comprising the chain of URLs visited following query submission. However, since long queries are typically rare, they are difficult to label in this way because little or no historic log data exists for them. A subset of these queries may be amenable to labeling by detecting similarities between parts of a long and rare query and the queries which appear in logs. In this article, we present the comparison of four similarity algorithms for the automatic assignment of Open Directory Project category labels to long and rare queries, based solely on matching against similar satisfied query trails extracted from log data. 
Our findings show that although the similarity-matching algorithms we investigated have tradeoffs in terms of coverage and accuracy, one algorithm that bases similarity on a popular search result ranking function (effectively regarding potentially-similar queries as \u201cdocuments\u201d) outperforms the others. We find that it is possible to correctly predict the top label better than one in five times, even when no past query trail exactly matches the long and rare query. We show that these labels can be used to reorder top-ranked search results leading to a significant improvement in retrieval performance over baselines that do not utilize query labeling, but instead rank results using content-matching or click-through logs. The outcomes of our research have implications for search providers attempting to provide users with highly-relevant search results for long queries.\n          <\/jats:p>","DOI":"10.1145\/1841909.1841912","type":"journal-article","created":{"date-parts":[[2010,10,5]],"date-time":"2010-10-05T14:38:15Z","timestamp":1286289495000},"page":"1-27","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":25,"title":["Mining Historic Query Trails to Label Long and Rare Search Engine Queries"],"prefix":"10.1145","volume":"4","author":[{"given":"Peter","family":"Bailey","sequence":"first","affiliation":[{"name":"Microsoft"}]},{"given":"Ryen W.","family":"White","sequence":"additional","affiliation":[{"name":"Microsoft Research"}]},{"given":"Han","family":"Liu","sequence":"additional","affiliation":[{"name":"Carnegie Mellon University"}]},{"given":"Giridhar","family":"Kumaran","sequence":"additional","affiliation":[{"name":"Microsoft"}]}],"member":"320","published-online":{"date-parts":[[2010,9]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/1148170.1148177"},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of the 5th Text Retrieval Conference (TREC). 
NIST, 119--132","author":"Allan J."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/564376.564430"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2005.80"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/1229179.1229183"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1390334.1390419"},{"key":"e_1_2_1_7_1","volume-title":"Proceedings of the 19th Annual Australasian Database Conference. 65--74","author":"Bennett G."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/1772690.1772703"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/1367497.1367505"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1242572.1242675"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/1277741.1277783"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1016\/0306-4573(94)00050-D"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/1076034.1076067"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/564376.564474"},{"key":"e_1_2_1_15_1","unstructured":"Cohen J. 1988. Statistical Power Analysis for the Behavioral Sciences 2nd Ed. Lawrence Erlbaum."},{"key":"e_1_2_1_16_1","volume-title":"Proceedings of EMNLP-CoNLL. 708--716","author":"Cucerzan S.","year":"2007"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/371920.372165"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/956863.956925"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/582415.582418"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/1117454.1117468"},{"key":"e_1_2_1_21_1","volume-title":"Proceedings of the HLT-NAACL.
220--227","author":"Kumaran G."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/1390334.1390339"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/1571941.1572038"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-00958-7_11"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/1390334.1390393"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/1117454.1117466"},{"key":"e_1_2_1_27_1","volume-title":"Proceedings of the 29th European Conference on Information Retrieval. 16--27","author":"Metzler D."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/1277741.1277823"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/1277741.1277870"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/290941.291008"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/1135777.1135883"},{"key":"e_1_2_1_32_1","volume-title":"Proceedings of the 3rd Text REtrieval Conference (TREC\u201994)","author":"Robertson S."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/1031171.1031181"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/1117454.1117467"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/1062745.1062889"},{"key":"e_1_2_1_36_1","volume-title":"Indri: A language-model based search engine for complex queries (extended version). IR 407, U. 
Massachusetts.","author":"Strohman T.","year":"2005"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/1117454.1117469"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.5555\/188490.188508"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/1571941.1572005"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/1242572.1242576"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/383952.384019"}],"container-title":["ACM Transactions on the Web"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1841909.1841912","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1841909.1841912","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T12:08:57Z","timestamp":1750248537000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1841909.1841912"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,9]]},"references-count":41,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2010,9]]}},"alternative-id":["10.1145\/1841909.1841912"],"URL":"https:\/\/doi.org\/10.1145\/1841909.1841912","relation":{},"ISSN":["1559-1131","1559-114X"],"issn-type":[{"value":"1559-1131","type":"print"},{"value":"1559-114X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2010,9]]},"assertion":[{"value":"2009-06-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2010-05-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2010-09-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}