{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,11]],"date-time":"2026-05-11T11:29:44Z","timestamp":1778498984662,"version":"3.51.4"},"reference-count":69,"publisher":"SAGE Publications","issue":"1","license":[{"start":{"date-parts":[[2021,3,29]],"date-time":"2021-03-29T00:00:00Z","timestamp":1616976000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Journal of Information Science"],"published-print":{"date-parts":[[2023,2]]},"abstract":"<jats:p>This article presents a new query expansion (QE) method aiming to tackle term mismatch in information retrieval (IR). Previous research showed that selecting good expansion terms which do not hurt retrieval effectiveness remains an open and challenging research question. Our method investigates how global statistics of term co-occurrence can be used effectively to enhance expansion term selection and reweighting. Indeed, we build a co-occurrence graph using a context window approach over the entire collection, thus adopting a global QE approach. Then, we employ a semantic similarity measure inspired by the Okapi BM25 model, which allows to evaluate the discriminative power of words and to select relevant expansion terms based on their similarity to the query as a whole. The proposed method includes a reweighting step where selected terms are assigned weights according to their relevance to the query. What\u2019s more, our method does not require matrix factorisation or complex text mining processes. It only requires simple co-occurrence statistics about terms, which reduces complexity and insures scalability. Finally, it has two free parameters that may be tuned to adapt the model to the context of a given collection and control co-occurrence normalisation. Extensive experiments on four standard datasets of English (TREC Robust04 and Washington Post) and French (CLEF2000 and CLEF2003) show that our method improves both retrieval effectiveness and robustness in terms of various evaluation metrics and outperforms competitive state-of-the-art baselines with significantly better results. We also investigate the impact of varying the number of expansion terms on retrieval results.<\/jats:p>","DOI":"10.1177\/0165551521998047","type":"journal-article","created":{"date-parts":[[2021,3,30]],"date-time":"2021-03-30T00:33:44Z","timestamp":1617064424000},"page":"183-206","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":5,"title":["A discriminative method for global query expansion and term reweighting using co-occurrence graphs"],"prefix":"10.1177","volume":"49","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4309-572X","authenticated-orcid":false,"given":"Billel","family":"Aklouche","sequence":"first","affiliation":[{"name":"LISI Laboratory of Computer Science for Industrial System, INSAT, Carthage University, Tunis, Tunisia; National School of Computer Science (ENSI), Manouba University, Manouba, Tunisia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6310-7062","authenticated-orcid":false,"given":"Ibrahim","family":"Bounhas","sequence":"additional","affiliation":[{"name":"LISI Laboratory of Computer Science for Industrial System, INSAT, Carthage University, Tunis, Tunisia"}]},{"given":"Yahya","family":"Slimani","sequence":"additional","affiliation":[{"name":"LISI Laboratory of Computer Science for Industrial System, INSAT, Carthage University, Tunis, Tunisia; Higher Institute of Multimedia Arts of Manouba (ISAMM), Manouba University, Manouba, Tunisia"}]}],"member":"179","published-online":{"date-parts":[[2021,3,29]]},"reference":[{"key":"bibr1-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2011.01.007"},{"key":"bibr2-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2019.05.009"},{"key":"bibr3-0165551521998047","first-page":"119","volume-title":"Proceedings of the seventh international conference on user modeling","author":"Lau T"},{"key":"bibr4-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1145\/2071389.2071390"},{"key":"bibr5-0165551521998047","first-page":"4","volume-title":"Proceedings of the 19th annual international ACM SIGIR conference on research and development in information retrieval","author":"Xu J"},{"key":"bibr6-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2017.09.001"},{"key":"bibr7-0165551521998047","first-page":"688","volume-title":"Proceedings of the 14th ACM international conference on information and knowledge management","author":"Bai J"},{"key":"bibr8-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2019.04.019"},{"key":"bibr9-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1145\/333135.333138"},{"key":"bibr10-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1108\/eb026637"},{"key":"bibr11-0165551521998047","first-page":"206","volume-title":"Proceedings of the 21st annual international ACM SIGIR conference on research and development in information retrieval","author":"Mitra M"},{"key":"bibr12-0165551521998047","first-page":"704","volume-title":"Proceedings of the 14th ACM international conference on information and knowledge management","author":"Collins-Thompson K"},{"key":"bibr13-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1097-4571(199106)42:5<378::AID-ASI8>3.0.CO;2-8"},{"key":"bibr14-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1007\/s10791-011-9172-x"},{"key":"bibr15-0165551521998047","first-page":"404","volume-title":"Proceedings of the 2004 conference on empirical methods in natural language processing","author":"Mihalcea R"},{"key":"bibr16-0165551521998047","first-page":"232","volume-title":"Proceedings of the 17th annual international ACM SIGIR conference on research and development in information retrieval","author":"Robertson SE"},{"key":"bibr17-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1016\/S0306-4573(02)00018-3"},{"key":"bibr18-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1023\/B:INRT.0000009438.69013.fa"},{"key":"bibr19-0165551521998047","first-page":"70","volume-title":"Proceedings of the Thirteen Text REtrieval Conference (TREC 2004)","author":"Voorhees EM."},{"key":"bibr20-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2006.09.003"},{"key":"bibr21-0165551521998047","first-page":"313","volume-title":"The SMART retrieval system: experiments in automatic document processing","author":"Rocchio JJ.","year":"1971"},{"key":"bibr22-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1108\/eb026683"},{"key":"bibr23-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511809071"},{"key":"bibr24-0165551521998047","first-page":"696","volume-title":"Proceedings of the 14th ACM international conference on information and knowledge management","author":"Fonseca BM"},{"key":"bibr25-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2014.07.004"},{"key":"bibr26-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1037\/0033-295X.114.2.211"},{"key":"bibr27-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1145\/366836.366860"},{"key":"bibr28-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2019.102182"},{"key":"bibr29-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1177\/0165551514533771"},{"key":"bibr30-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2020.102342"},{"key":"bibr31-0165551521998047","first-page":"1483","volume-title":"Proceedings of the 25th ACM international on conference on information and knowledge management","author":"Zamani H"},{"key":"bibr32-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1007\/s00521-016-2207-x"},{"key":"bibr33-0165551521998047","first-page":"678","volume-title":"Proceedings of the 33rd annual ACM symposium on applied computing","author":"Valcarce D"},{"key":"bibr34-0165551521998047","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-1035"},{"key":"bibr35-0165551521998047","first-page":"505","volume-title":"Proceedings of the 40th international ACM SIGIR conference on research and development in information retrieval","author":"Zamani H"},{"key":"bibr36-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1177\/0165551518792210"},{"key":"bibr37-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2019.04.007"},{"key":"bibr38-0165551521998047","first-page":"243","volume-title":"Proceedings of the 31st Annual International ACM SIGIR conference on research and development in information retrieval","author":"Cao G"},{"key":"bibr39-0165551521998047","first-page":"709","volume-title":"38th European conference on information retrieval","author":"ALMasri M"},{"key":"bibr40-0165551521998047","volume-title":"Proceedings of the Twenty-Seventh Text REtrieval conference (TREC 2018)","author":"Aklouche B"},{"key":"bibr41-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2019.12.002"},{"key":"bibr42-0165551521998047","first-page":"3111","volume-title":"Proceedings of the 26th international conference on neural information processing systems","author":"Mikolov T"},{"key":"bibr43-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1145\/219717.219748"},{"key":"bibr44-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1007\/s10844-020-00596-8"},{"key":"bibr45-0165551521998047","first-page":"297","volume-title":"42nd European conference on information retrieval","author":"Padaki R"},{"key":"bibr46-0165551521998047","first-page":"4718","volume-title":"Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP): findings, online event","author":"Zheng Z"},{"key":"bibr47-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1080\/10919392.2018.1517481"},{"key":"bibr48-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1177\/0165551518790424"},{"key":"bibr49-0165551521998047","first-page":"887","volume-title":"Workshops of the 33rd international conference on advanced information networking and applications","author":"Abu-Salih B"},{"key":"bibr50-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1186\/s40537-020-0283-3"},{"key":"bibr51-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2019.2894679"},{"key":"bibr52-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1002\/asi.4630270302"},{"key":"bibr53-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1016\/0306-4573(88)90021-0"},{"key":"bibr54-0165551521998047","first-page":"715","volume-title":"Proceedings Joint 9th IFSA world congress and 20th North American fuzzy information processing society (NAFIPS) international conference","author":"Kim BM"},{"key":"bibr55-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1348\/000711000159213"},{"key":"bibr56-0165551521998047","first-page":"296","volume-title":"Proceedings of the 52nd annual meeting of the association for computational linguistics (ACL)","author":"Gawron JM"},{"key":"bibr57-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1108\/eb046814"},{"key":"bibr58-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1007\/s10462-016-9498-2"},{"key":"bibr59-0165551521998047","first-page":"65","volume-title":"Proceedings of the 26th International symposium on string processing and information retrieval (SPIRE 2019)","author":"Aklouche B"},{"key":"bibr60-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1002\/j.1538-7305.1948.tb01338.x"},{"key":"bibr61-0165551521998047","volume-title":"Probability models for information retrieval based on divergence from randomness","author":"Amati G.","year":"2003"},{"key":"bibr62-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1145\/1105696.1105699"},{"key":"bibr63-0165551521998047","first-page":"111","volume-title":"Proceedings of the 23rd ACM international conference on information and knowledge management","author":"Huston S"},{"key":"bibr64-0165551521998047","first-page":"58","volume-title":"Proceedings of the 2014 Australasian document computing symposium","author":"Trotman A"},{"key":"bibr65-0165551521998047","volume-title":"Proceedings of the Thirteenth Text REtrieval Conference (TREC 2004)","author":"Plachouras V"},{"key":"bibr66-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1145\/582415.582418"},{"key":"bibr67-0165551521998047","first-page":"837","volume-title":"Proceedings of the 18th ACM conference on information and knowledge management","author":"Collins-Thompson K"},{"key":"bibr68-0165551521998047","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2005.03.025"},{"key":"bibr69-0165551521998047","first-page":"7","volume-title":"Proceedings of the 20th ACM international conference on information and knowledge management","author":"Lv Y"}],"container-title":["Journal of Information Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0165551521998047","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/0165551521998047","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0165551521998047","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T23:09:33Z","timestamp":1777504173000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/0165551521998047"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,3,29]]},"references-count":69,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2023,2]]}},"alternative-id":["10.1177\/0165551521998047"],"URL":"https:\/\/doi.org\/10.1177\/0165551521998047","relation":{},"ISSN":["0165-5515","1741-6485"],"issn-type":[{"value":"0165-5515","type":"print"},{"value":"1741-6485","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,3,29]]}}}