{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,16]],"date-time":"2026-07-16T05:00:22Z","timestamp":1784178022818,"version":"3.55.0"},"reference-count":70,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2022,12,21]],"date-time":"2022-12-21T00:00:00Z","timestamp":1671580800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001602","name":"Science Foundation Ireland","doi-asserted-by":"crossref","award":["SFI\/12\/RC\/2289_P2"],"award-info":[{"award-number":["SFI\/12\/RC\/2289_P2"]}],"id":[{"id":"10.13039\/501100001602","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Inf. Syst."],"published-print":{"date-parts":[[2023,4,30]]},"abstract":"<jats:p>Query performance prediction (QPP) methods, which aim to predict the performance of a query, often rely on evidences in the form of different characteristic patterns in the distribution of Retrieval Status Values (RSVs). However, for neural IR models, it is usually observed that the RSVs are often less reliable for QPP because they are bounded within short intervals, different from the situation for statistical models. To address this limitation, we propose a model-agnostic QPP framework that gathers additional evidences by leveraging information from the characteristic patterns of RSV distributions computed over a set of<jats:italic>automatically generated<\/jats:italic>query variants, relative to that of the current query. Specifically, the idea behind our proposed method\u2014Weighted Relative Information Gain (WRIG), is that a substantial relative decrease or increase in the standard deviation of the RSVs of the query variants is likely to be a relative indicator of how easy or difficult the original query is. To cater for the absence of human-annotated query variants in real-world scenarios, we further propose an automatic query variant generation method. This can produce variants in a controlled manner by substituting terms from the original query with new ones sampled from a weighted distribution, constructed either via a relevance model or with the help of an embedded representation of query terms. Our experiments on the TREC-Robust, ClueWeb09B, and MS MARCO\u00a0datasets show that WRIG, by the use of this relative changes in QPP estimate, leads to significantly better results than a state-of-the-art baseline method that leverages information from (manually created) query variants by the application of additive smoothing [<jats:xref ref-type=\"bibr\">64<\/jats:xref>]. The results also show that our approach can improve the QPP effectiveness of neural retrieval approaches in particular.<\/jats:p>","DOI":"10.1145\/3545112","type":"journal-article","created":{"date-parts":[[2022,6,23]],"date-time":"2022-06-23T08:59:19Z","timestamp":1655974759000},"page":"1-31","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":25,"title":["A Relative Information Gain-based Query Performance Prediction Framework with Generated Query Variants"],"prefix":"10.1145","volume":"41","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9220-6652","authenticated-orcid":false,"given":"Suchana","family":"Datta","sequence":"first","affiliation":[{"name":"University College Dublin, Belfield, Dublin, Ireland"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0050-7138","authenticated-orcid":false,"given":"Debasis","family":"Ganguly","sequence":"additional","affiliation":[{"name":"University of Glasgow, Glasgow, Scotland, United Kingdom"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9045-9971","authenticated-orcid":false,"given":"Mandar","family":"Mitra","sequence":"additional","affiliation":[{"name":"Indian Statistical Institute, Dunlop, Kolkata, West Bengal, India"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8065-5418","authenticated-orcid":false,"given":"Derek","family":"Greene","sequence":"additional","affiliation":[{"name":"University College Dublin, Belfield, Dublin, Ireland"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2022,12,21]]},"reference":[{"key":"e_1_3_2_2_2","unstructured":"2021. Waterloo Spam Rankings for the ClueWeb09 Dataset. Retrieved from https:\/\/plg.uwaterloo.ca\/gvcormac\/clueweb09spam\/."},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.1145\/3331184.3331246"},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1145\/3459637.3482063"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10791-010-9145-5"},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.1145\/2911451.2914671"},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.1145\/3077136.3080839"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1016\/0306-4573(94)00057-A"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1145\/1390334.1390377"},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1145\/1835449.1835683"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1145\/1148170.1148238"},{"key":"e_1_3_2_12_2","series-title":"NIST Special Publication","volume-title":"TREC","author":"Carterette Ben","year":"2014","unstructured":"Ben Carterette, Evangelos Kanoulas, Mark M. Hall, and Paul D. Clough. 2014. Overview of the TREC 2014 session track. In TREC(NIST Special Publication, Vol. 500-308). National Institute of Standards and Technology (NIST)."},{"key":"e_1_3_2_13_2","first-page":"125","volume-title":"Retrievability-based Document Selection for Relevance Feedback with Automatically Generated Query Variants","author":"Chakraborty Anirban","year":"2020","unstructured":"Anirban Chakraborty, Debasis Ganguly, and Owen Conlan. 2020. Retrievability-based Document Selection for Relevance Feedback with Automatically Generated Query Variants. Association for Computing Machinery, New York, NY, 125\u2013134."},{"key":"e_1_3_2_14_2","series-title":"NIST Special Publication","volume-title":"TREC","author":"Clarke Charles L. A.","year":"2010","unstructured":"Charles L. A. Clarke, Nick Craswell, Ian Soboroff, and Gordon V. Cormack. 2010. Overview of the TREC 2010 web track. In TREC(NIST Special Publication, Vol. 500-294). National Institute of Standards and Technology (NIST)."},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1145\/564376.564429"},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10791-006-9006-4"},{"key":"e_1_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.1145\/2559170"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1145\/2009916.2010063"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1145\/3159652.3159659"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1145\/3488560.3498491"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1145\/3077136.3080832"},{"key":"e_1_3_2_22_2","first-page":"4171","volume-title":"Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics. 4171\u20134186."},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1145\/1277741.1277841"},{"key":"e_1_3_2_24_2","doi-asserted-by":"crossref","first-page":"115","DOI":"10.1007\/978-3-030-72113-8_8","volume-title":"Advances in Information Retrieval","author":"Faggioli Guglielmo","year":"2021","unstructured":"Guglielmo Faggioli, Oleg Zendel, J. Shane Culpepper, Nicola Ferro, and Falk Scholer. 2021. An enhanced evaluation framework for query performance prediction. In Advances in Information Retrieval. Springer International Publishing, Cham, 115\u2013129."},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.1145\/2484028.2484069"},{"key":"e_1_3_2_26_2","series-title":"Proceedings of the 14th International Conference on Artificial Intelligence and Statistics","first-page":"315","volume":"15","author":"Glorot Xavier","year":"2011","unstructured":"Xavier Glorot, Antoine Bordes, and Yoshua Bengio. 2011. Deep sparse rectifier neural networks. In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics(Proceedings of Machine Learning Research, Vol. 15), Geoffrey Gordon, David Dunson, and Miroslav Dud\u00edk (Eds.). PMLR, Fort Lauderdale, FL, 315\u2013323."},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1145\/2983323.2983769"},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.1561\/1500000050"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.1145\/1842890.1842906"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1145\/1458082.1458311"},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1145\/1458082.1458311"},{"key":"e_1_3_2_32_2","first-page":"43","volume-title":"String Processing and Information Retrieval","author":"He Ben","year":"2004","unstructured":"Ben He and Iadh Ounis. 2004. Inferring query performance using pre-retrieval predictors. In String Processing and Information Retrieval. Springer Berlin, 43\u201354."},{"key":"e_1_3_2_33_2","series-title":"NIST Special Publication","volume-title":"TREC","author":"Jaleel Nasreen Abdul","year":"2004","unstructured":"Nasreen Abdul Jaleel, James Allan, W. Bruce Croft, Fernando Diaz, Leah S. Larkey, Xiaoyan Li, Mark D. Smucker, and Courtney Wade. 2004. UMass at TREC 2004: Novelty and HARD. In TREC(NIST Special Publication, Vol. 500-261). National Institute of Standards and Technology (NIST)."},{"key":"e_1_3_2_34_2","first-page":"19","volume-title":"A Quantum Interference Inspired Neural Matching Model for Ad-Hoc Retrieval","author":"Jiang Yongyu","year":"2020","unstructured":"Yongyu Jiang, Peng Zhang, Hui Gao, and Dawei Song. 2020. A Quantum Interference Inspired Neural Matching Model for Ad-Hoc Retrieval. Association for Computing Machinery, New York, NY, 19\u201328."},{"key":"e_1_3_2_35_2","first-page":"39","volume-title":"ColBERT: Efficient and Effective Passage Search Via Contextualized Late Interaction Over BERT","author":"Khattab Omar","year":"2020","unstructured":"Omar Khattab and Matei Zaharia. 2020. ColBERT: Efficient and Effective Passage Search Via Contextualized Late Interaction Over BERT. Association for Computing Machinery, New York, NY, 39\u201348."},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.5555\/2040317.2040323"},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.1145\/383952.383972"},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.1145\/2396761.2396768"},{"key":"e_1_3_2_39_2","first-page":"3111","volume-title":"Proceedings of the Conference on Neural Information Processing Systems","author":"Mikolov Tomas","year":"2013","unstructured":"Tomas Mikolov, Ilya Sutskever, Kai Chen, Gregory S. Corrado, and Jeffrey Dean. 2013. Distributed representations of words and phrases and their compositionality. In Proceedings of the Conference on Neural Information Processing Systems. Curran Associates Inc., Red Hook, NY, 3111\u20133119."},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1145\/2600428.2609508"},{"key":"e_1_3_2_41_2","first-page":"59","volume-title":"A Reinforcement Learning Framework for Relevance Feedback","author":"Montazeralghaem Ali","year":"2020","unstructured":"Ali Montazeralghaem, Hamed Zamani, and James Allan. 2020. A Reinforcement Learning Framework for Relevance Feedback. Association for Computing Machinery, New York, NY, 59\u201368."},{"key":"e_1_3_2_42_2","article-title":"MS MARCO: A human generated machine reading comprehension dataset.","volume":"1611","author":"Nguyen Tri","year":"2016","unstructured":"Tri Nguyen, Mir Rosenberg, Xia Song, Jianfeng Gao, Saurabh Tiwary, Rangan Majumder, and Li Deng. 2016. MS MARCO: A human generated machine reading comprehension dataset. CoRR abs\/1611.09268 (2016).","journal-title":"CoRR"},{"key":"e_1_3_2_43_2","doi-asserted-by":"crossref","first-page":"207","DOI":"10.1007\/978-3-642-16321-0_21","volume-title":"String Processing and Information Retrieval","author":"P\u00e9rez-Iglesias Joaqu\u00edn","year":"2010","unstructured":"Joaqu\u00edn P\u00e9rez-Iglesias and Lourdes Araujo. 2010. Standard deviation as a query hardness estimator. In String Processing and Information Retrieval, Edgar Chavez and Stefano Lonardi (Eds.). Springer Berlin, 207\u2013212."},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.1145\/290941.291008"},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.1002\/pra2.2017.14505401037"},{"key":"e_1_3_2_46_2","doi-asserted-by":"crossref","unstructured":"S. E. Robertson S. Walker M. M. Beaulieu M. Gatford and A. Payne. 1996. Okapi at TREC-4. In Proceedings of the Fourth Text Retrieval Conference (TREC-4\u201996) . NIST 73\u201396. https:\/\/www.microsoft.com\/en-us\/research\/publication\/okapi-at-trec-4\/.","DOI":"10.6028\/NIST.SP.500-236.city"},{"key":"e_1_3_2_47_2","first-page":"281","volume-title":"The Probability Ranking Principle in IR","author":"Robertson S. E.","year":"1997","unstructured":"S. E. Robertson. 1997. The Probability Ranking Principle in IR. Morgan Kaufmann Publishers Inc., San Francisco, CA, 281\u2013286."},{"key":"e_1_3_2_48_2","doi-asserted-by":"publisher","DOI":"10.1145\/3077136.3080665"},{"key":"e_1_3_2_49_2","doi-asserted-by":"publisher","DOI":"10.1145\/3331184.3331334"},{"key":"e_1_3_2_50_2","doi-asserted-by":"publisher","DOI":"10.1145\/3121050.3121087"},{"key":"e_1_3_2_51_2","doi-asserted-by":"publisher","DOI":"10.1145\/3331184.3331369"},{"key":"e_1_3_2_52_2","doi-asserted-by":"publisher","DOI":"10.1145\/3331184.3331369"},{"key":"e_1_3_2_53_2","doi-asserted-by":"publisher","DOI":"10.1145\/2983323.2983750"},{"key":"e_1_3_2_54_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2018.10.009"},{"key":"e_1_3_2_55_2","doi-asserted-by":"publisher","DOI":"10.1145\/1835449.1835494"},{"key":"e_1_3_2_56_2","doi-asserted-by":"publisher","DOI":"10.1145\/2926790"},{"key":"e_1_3_2_57_2","doi-asserted-by":"publisher","DOI":"10.1145\/2180868.2180873"},{"key":"e_1_3_2_58_2","doi-asserted-by":"publisher","DOI":"10.1145\/2661829.2661906"},{"key":"e_1_3_2_59_2","doi-asserted-by":"publisher","DOI":"10.1145\/3166072.3166079"},{"key":"e_1_3_2_60_2","doi-asserted-by":"publisher","DOI":"10.1145\/1852102.1852106"},{"key":"e_1_3_2_61_2","doi-asserted-by":"publisher","DOI":"10.1145\/3077136.3080809"},{"key":"e_1_3_2_62_2","doi-asserted-by":"publisher","DOI":"10.1145\/1076034.1076121"},{"key":"e_1_3_2_63_2","doi-asserted-by":"publisher","DOI":"10.1145\/3209978.3210041"},{"key":"e_1_3_2_64_2","doi-asserted-by":"publisher","DOI":"10.1145\/2983323.2983844"},{"key":"e_1_3_2_65_2","doi-asserted-by":"publisher","DOI":"10.1145\/3331184.3331253"},{"key":"e_1_3_2_66_2","doi-asserted-by":"publisher","DOI":"10.1145\/383952.384019"},{"key":"e_1_3_2_67_2","doi-asserted-by":"publisher","DOI":"10.1145\/984321.984322"},{"key":"e_1_3_2_68_2","first-page":"1941","volume-title":"An Analysis of BERT in Document Ranking","author":"Zhan Jingtao","year":"2020","unstructured":"Jingtao Zhan, Jiaxin Mao, Yiqun Liu, Min Zhang, and Shaoping Ma. 2020. An Analysis of BERT in Document Ranking. Association for Computing Machinery, New York, NY, 1941\u20131944."},{"key":"e_1_3_2_69_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-78646-7_8"},{"key":"e_1_3_2_70_2","doi-asserted-by":"publisher","DOI":"10.1145\/1183614.1183696"},{"key":"e_1_3_2_71_2","doi-asserted-by":"publisher","DOI":"10.1145\/1277741.1277835"}],"container-title":["ACM Transactions on Information Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3545112","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3545112","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:02:44Z","timestamp":1750186964000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3545112"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,12,21]]},"references-count":70,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2023,4,30]]}},"alternative-id":["10.1145\/3545112"],"URL":"https:\/\/doi.org\/10.1145\/3545112","relation":{},"ISSN":["1046-8188","1558-2868"],"issn-type":[{"value":"1046-8188","type":"print"},{"value":"1558-2868","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,12,21]]},"assertion":[{"value":"2021-10-28","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-06-14","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-12-21","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}