{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,23]],"date-time":"2026-01-23T11:02:44Z","timestamp":1769166164335,"version":"3.49.0"},"reference-count":51,"publisher":"SAGE Publications","issue":"4","license":[{"start":{"date-parts":[[2018,8,9]],"date-time":"2018-08-09T00:00:00Z","timestamp":1533772800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Journal of Information Science"],"published-print":{"date-parts":[[2019,8]]},"abstract":"<jats:p> Pseudo-relevance feedback (PRF) is a very effective query expansion approach, which reformulates queries by selecting expansion terms from top k pseudo-relevant documents. Although standard PRF models have been proven effective to deal with vocabulary mismatch between users\u2019 queries and relevant documents, expansion terms are selected without considering their similarity to the original query terms. In this article, we propose a method to incorporate word embedding (WE) similarity into PRF models for Arabic information retrieval (IR). The main idea is to select expansion terms using their distribution in the set of top pseudo-relevant documents along with their similarity to the original query terms. Experiments are conducted on the standard Arabic TREC 2001\/2002 collection using three neural WE models. The obtained results show that our PRF extensions significantly outperform their baseline PRF models. Moreover, they enhanced the baseline IR model by 22% and 68% for the mean average precision (MAP) and the robustness index (RI), respectively. <\/jats:p>","DOI":"10.1177\/0165551518792210","type":"journal-article","created":{"date-parts":[[2018,8,9]],"date-time":"2018-08-09T14:40:40Z","timestamp":1533825640000},"page":"429-442","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":22,"title":["Word-embedding-based pseudo-relevance feedback for Arabic information retrieval"],"prefix":"10.1177","volume":"45","author":[{"given":"Abdelkader","family":"El Mahdaouy","sequence":"first","affiliation":[{"name":"LIM Laboratory, Faculty of Sciences Dhar el Mahraz, Sidi Mohamed Ben Abdellah University, Fez, Morocco"},{"name":"Universit\u00e9 Grenoble Alpes, CNRS, Grenoble INP, LIG, F-38000 Grenoble, France"}]},{"given":"Sa\u00efd Ouatik","family":"El Alaoui","sequence":"additional","affiliation":[{"name":"LIM Laboratory, Faculty of Sciences Dhar el Mahraz, Sidi Mohamed Ben Abdellah University, Fez, Morocco"}]},{"given":"Eric","family":"Gaussier","sequence":"additional","affiliation":[{"name":"Universit\u00e9 Grenoble Alpes, CNRS, Grenoble INP, LIG, F-38000 Grenoble, France"}]}],"member":"179","published-online":{"date-parts":[[2018,8,9]]},"reference":[{"key":"bibr1-0165551518792210","doi-asserted-by":"publisher","DOI":"10.1145\/366836.366860"},{"key":"bibr2-0165551518792210","first-page":"837","volume-title":"Proceedings of the 18th ACM conference on information and knowledge management","author":"Collins-Thompson K."},{"key":"bibr3-0165551518792210","doi-asserted-by":"publisher","DOI":"10.1145\/2071389.2071390"},{"key":"bibr4-0165551518792210","first-page":"6","volume-title":"Proceedings of the 2013 conference on the theory of information retrieval","author":"Clinchant S"},{"key":"bibr5-0165551518792210","doi-asserted-by":"crossref","unstructured":"Almasri M, Berrut C, Chevallet J. A comparison of deep learning based query expansion with pseudo-relevance feedback and mutual information. In: Proceedings of the advances in information retrieval \u2013 38th European conference on IR research, ECIR 2016, Padua, 20\u201323 March 2016, pp. 709\u2013715, https:\/\/hal.archives-ouvertes.fr\/hal-01576603\/document","DOI":"10.1007\/978-3-319-30671-1_57"},{"key":"bibr6-0165551518792210","doi-asserted-by":"crossref","unstructured":"Lavrenko V, Croft WB. Relevance-based language models. In: Proceedings of the 24th annual international ACM SIGIR conference on research and development in information retrieval, New Orleans, LA, 9\u201312 September 2001, pp. 120\u2013127. New York: ACM.","DOI":"10.1145\/383952.383972"},{"key":"bibr7-0165551518792210","doi-asserted-by":"publisher","DOI":"10.1145\/582415.582416"},{"key":"bibr8-0165551518792210","doi-asserted-by":"crossref","unstructured":"Fang H, Zhai C. Semantic term matching in axiomatic approaches to information retrieval. In: Proceedings of the 29th annual international ACM SIGIR conference on research and development in information retrieval, Seattle, WA, 6\u201311 August 2006, pp. 115\u2013122. New York: ACM.","DOI":"10.1145\/1148170.1148193"},{"key":"bibr9-0165551518792210","doi-asserted-by":"publisher","DOI":"10.1177\/0165551515594722"},{"key":"bibr10-0165551518792210","doi-asserted-by":"crossref","unstructured":"Montazeralghaem A, Zamani H, Shakery A. Axiomatic analysis for improving the log-logistic feedback model. In: Proceedings of the 39th international ACM SIGIR conference on research and development in information retrieval, Pisa, 17\u201321 July 2016, pp. 765\u2013768. New York: ACM.","DOI":"10.1145\/2911451.2914768"},{"key":"bibr11-0165551518792210","volume-title":"Efficient estimation of word representations in vector space","author":"Mikolov T","year":"2013"},{"key":"bibr12-0165551518792210","doi-asserted-by":"crossref","unstructured":"Pennington J, Socher R, Manning C. Glove: global vectors for word representation. In: Proceedings of the 2014 conference on empirical methods in natural language, pp. 1532\u20131543. Doha, Qatar: Association for Computational Linguistics, https:\/\/www.aclweb.org\/anthology\/D14-1162","DOI":"10.3115\/v1\/D14-1162"},{"key":"bibr13-0165551518792210","doi-asserted-by":"crossref","unstructured":"Baroni M, Dinu G, Kruszewski G. Don\u2019t count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors. In: Proceedings of the 52nd annual meeting of the association for computational linguistics, Baltimore, MD, 23\u201325 June 2014, pp. 238\u2013247. Baltimore, Maryland: ACL, http:\/\/www.aclweb.org\/anthology\/P14-1023","DOI":"10.3115\/v1\/P14-1023"},{"issue":"3","key":"bibr14-0165551518792210","first-page":"493","volume":"10","author":"Lofi C.","year":"2015","journal-title":"Inform Media Technol"},{"key":"bibr15-0165551518792210","doi-asserted-by":"crossref","unstructured":"Ganguly D, Roy D, Mitra M, et al. Word embedding based generalized language model for information retrieval. In: Proceedings of the 38th international ACM SIGIR conference on research and development in information retrieval, Santiago, 9\u201313 August 2015, pp. 795\u2013798. New York: ACM.","DOI":"10.1145\/2766462.2767780"},{"key":"bibr16-0165551518792210","first-page":"363","volume-title":"Proceedings of the 38th international ACM SIGIR conference on research and development in information retrieval","author":"Vuli\u0107 I"},{"key":"bibr17-0165551518792210","doi-asserted-by":"crossref","unstructured":"Zuccon G, Koopman B, Bruza P, et al. Integrating and evaluating neural word embeddings in information retrieval. In: Proceedings of the 20th Australasian document computing symposium, Parramatta, 8\u20139 December 2015, pp. 1\u201312. New York: ACM.","DOI":"10.1145\/2838931.2838936"},{"key":"bibr18-0165551518792210","first-page":"385","volume-title":"Proceedings of the 2016 4th IEEE international colloquium on information science and technology (CiSt)","author":"El Mahdaouy A"},{"key":"bibr19-0165551518792210","doi-asserted-by":"crossref","unstructured":"Kuzi S, Shtok A, Kurland O. Query expansion using word embeddings. In: Proceedings of the 25th ACM international on conference on information and knowledge management, Indianapolis, IN, 24\u201328 October 2016, pp. 1929\u20131932. New York: ACM.","DOI":"10.1145\/2983323.2983876"},{"key":"bibr20-0165551518792210","doi-asserted-by":"crossref","unstructured":"Diaz F, Mitra B, Craswell N. Query expansion with locally-trained word embeddings, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, Berlin, Germany, 7\u201312 August 2016. The Association for Computer Linguistics. http:\/\/www.aclweb.org\/anthology\/P16-1035","DOI":"10.18653\/v1\/P16-1035"},{"key":"bibr21-0165551518792210","doi-asserted-by":"crossref","unstructured":"Larkey LS, Ballesteros L, Connell ME. Improving stemming for Arabic information retrieval: light stemming and co-occurrence analysis. In: Proceedings of the 25th annual international ACM SIGIR conference on research and development in information retrieval, Tampere, 11\u201315 August 2002, pp. 275\u2013282. New York: ACM.","DOI":"10.1145\/564376.564425"},{"key":"bibr22-0165551518792210","first-page":"68","volume-title":"Proceedings of the international conference at the British computer society challenge of Arabic for NLP\/MT","author":"Kadri Y","year":"2006"},{"key":"bibr23-0165551518792210","doi-asserted-by":"publisher","DOI":"10.1002\/aris.2007.1440410118"},{"key":"bibr24-0165551518792210","first-page":"1803","volume-title":"Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management","author":"Algarni M"},{"key":"bibr25-0165551518792210","doi-asserted-by":"publisher","DOI":"10.1177\/0165551515585720"},{"key":"bibr26-0165551518792210","doi-asserted-by":"crossref","unstructured":"Abdelali A, Darwish K, Durrani N, et al. Farasa: a fast and furious segmenter for Arabic. In: Proceedings of the 2016 conference of the North American chapter of the association for computational linguistics demonstrations session, 12\u201317 June 2016, pp. 11\u201316. San Diego CA: Human Language Technologies, http:\/\/www.aclweb.org\/anthology\/N16-3003","DOI":"10.18653\/v1\/N16-3003"},{"issue":"4","key":"bibr27-0165551518792210","first-page":"14","volume":"4","author":"Guirat SB","year":"2016","journal-title":"Int J Softw Innov"},{"key":"bibr28-0165551518792210","doi-asserted-by":"publisher","DOI":"10.1007\/11575832_23"},{"key":"bibr29-0165551518792210","first-page":"1070","volume-title":"Proceedings of the tenth international conference on language resources and evaluation LREC 2016","author":"Darwish K"},{"key":"bibr30-0165551518792210","doi-asserted-by":"crossref","unstructured":"Shaalan K, Al-Sheikh S, Oroumchian F. Query expansion based-on similarity of terms for improving Arabic information retrieval. In: Proceedings of the international conference on intelligent information processing, 2012, pp. 167\u2013176. Berlin: Springer.","DOI":"10.1007\/978-3-642-32891-6_22"},{"key":"bibr31-0165551518792210","doi-asserted-by":"crossref","unstructured":"Mahgoub AY, Rashwan MA, Raafat H, et al. Semantic query expansion for Arabic information retrieval. In: Proceedings of the Arabic natural language processing workshop, 2014, pp. 87\u201392.","DOI":"10.3115\/v1\/W14-3611"},{"issue":"3","key":"bibr32-0165551518792210","first-page":"54","volume":"4","author":"Belalem G","year":"2014","journal-title":"Int J Inform Retr"},{"key":"bibr33-0165551518792210","doi-asserted-by":"publisher","DOI":"10.1007\/s10772-018-9492-y"},{"key":"bibr34-0165551518792210","first-page":"281","volume-title":"Proceedings of the 34th European conference on IR research, ECIR 2012, lecture notes in computer science (LNCS)","volume":"7224","author":"Li B"},{"key":"bibr35-0165551518792210","doi-asserted-by":"publisher","DOI":"10.1007\/s10772-015-9307-3"},{"key":"bibr36-0165551518792210","doi-asserted-by":"crossref","unstructured":"Zamani H, Croft WB. Embedding-based query language models. In: Proceedings of the 2016 ACM international conference on the theory of information retrieval, Newark, DE, 12\u201316 September 2016, pp. 147\u2013156. New York: ACM.","DOI":"10.1145\/2970398.2970405"},{"key":"bibr37-0165551518792210","volume-title":"Proceedings of the 16th international conference part I, lecture notes in computer science, CICLing 2015","volume":"9041","author":"Zahran MA"},{"key":"bibr38-0165551518792210","doi-asserted-by":"crossref","unstructured":"Zamani H, Dadashkarimi J, Shakery A, et al. Pseudo-relevance feedback based on matrix factorization. In: Proceedings of the 25th ACM international on conference on information and knowledge management, Indianapolis, IN, 24\u201328 October 2016, pp. 1483\u20131492. New York: ACM.","DOI":"10.1145\/2983323.2983844"},{"key":"bibr39-0165551518792210","first-page":"234","volume-title":"Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Clinchant S"},{"key":"bibr40-0165551518792210","doi-asserted-by":"crossref","unstructured":"Ponte JM, Croft WB. A language modeling approach to information retrieval. In: Proceedings of the 21st annual international ACM SIGIR conference on research and development in information retrieval, Melbourne, 24\u201328 August 1998, pp. 275\u2013281. New York: ACM.","DOI":"10.1145\/290941.291008"},{"key":"bibr41-0165551518792210","doi-asserted-by":"crossref","unstructured":"Zhai C, Lafferty J. A study of smoothing methods for language models applied to ad hoc information retrieval. In: Proceedings of the 24th annual international ACM SIGIR conference on research and development in information retrieval, New Orleans, LA, 9\u201313 September 2001, pp. 334\u2013342. New York: ACM.","DOI":"10.1145\/383952.384019"},{"key":"bibr42-0165551518792210","unstructured":"Robertson SE, Walker S, Jones S, et al. Okapi at TREC-3. In: Proceedings of the TREC\u201994, pp. 109\u2013126."},{"key":"bibr43-0165551518792210","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4020-6046-5_12"},{"key":"bibr44-0165551518792210","volume-title":"Stemming Arabic text","author":"Khoja S"},{"key":"bibr45-0165551518792210","first-page":"L14","volume-title":"Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC\u201914)","author":"Pasha A","year":"2014"},{"key":"bibr46-0165551518792210","volume-title":"Build fast and accurate lemmatization for Arabic","author":"Mubarak H.","year":"2017"},{"key":"bibr47-0165551518792210","doi-asserted-by":"publisher","DOI":"10.1177\/0165551514558172"},{"key":"bibr48-0165551518792210","doi-asserted-by":"publisher","DOI":"10.1177\/0165551515625030"},{"key":"bibr49-0165551518792210","first-page":"1","volume-title":"Proceedings of the 2017 intelligent systems and computer vision (ISCV)","author":"Zeroual I"},{"key":"bibr50-0165551518792210","first-page":"29","volume-title":"Proceedings of the 2015 first international conference on Arabic computational linguistics (ACLing)","author":"Jaafar Y"},{"key":"bibr51-0165551518792210","first-page":"272","volume-title":"Proceedings of the 2014 third IEEE international colloquium in information science and technology (CIST)","author":"El Mahdaouy A"}],"container-title":["Journal of Information Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0165551518792210","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/0165551518792210","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0165551518792210","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,1]],"date-time":"2025-03-01T08:37:13Z","timestamp":1740818233000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/0165551518792210"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,8,9]]},"references-count":51,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2019,8]]}},"alternative-id":["10.1177\/0165551518792210"],"URL":"https:\/\/doi.org\/10.1177\/0165551518792210","relation":{},"ISSN":["0165-5515","1741-6485"],"issn-type":[{"value":"0165-5515","type":"print"},{"value":"1741-6485","type":"electronic"}],"subject":[],"published":{"date-parts":[[2018,8,9]]}}}