{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,22]],"date-time":"2025-08-22T06:10:06Z","timestamp":1755843006553,"version":"3.44.0"},"publisher-location":"New York, NY, USA","reference-count":20,"publisher":"ACM","license":[{"start":{"date-parts":[[2024,7,10]],"date-time":"2024-07-10T00:00:00Z","timestamp":1720569600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2024,7,10]]},"DOI":"10.1145\/3626772.3657952","type":"proceedings-article","created":{"date-parts":[[2024,7,11]],"date-time":"2024-07-11T12:40:05Z","timestamp":1720701605000},"page":"2564-2568","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["On Backbones and Training Regimes for Dense Retrieval in African Languages"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0009-0000-2630-8167","authenticated-orcid":false,"given":"Akintunde","family":"Oladipo","sequence":"first","affiliation":[{"name":"University of Waterloo, Waterloo, Ontario, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0009-0003-2859-7136","authenticated-orcid":false,"given":"Mofetoluwa","family":"Adeyemi","sequence":"additional","affiliation":[{"name":"University of Waterloo, Waterloo, Ontario, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0661-7189","authenticated-orcid":false,"given":"Jimmy","family":"Lin","sequence":"additional","affiliation":[{"name":"University of Waterloo, Waterloo, Ontario, Canada"}]}],"member":"320","published-online":{"date-parts":[[2024,7,11]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"crossref","unstructured":"David Adelani Graham Neubig Sebastian Ruder Shruti Rijhwani Michael Beukman Chester Palen-Michel Constantine Lignos Jesujoba Alabi 
Shamsuddeen Muhammad Peter Nabende Cheikh M. Bamba Dione Andiswa Bukula Rooweither Mabuya Bonaventure F. P. Dossou Blessing Sibanda Happy Buzaaba Jonathan Mukiibi Godson Kalipe Derguene Mbaye Amelia Taylor Fatoumata Kabore Chris Chinenye Emezue Anuoluwapo Aremu Perez Ogayo Catherine Gitau Edwin Munkoh-Buabeng Victoire Memdjokam Koagne Allahsera Auguste Tapo Tebogo Macucwa Vukosi Marivate Mboning Tchiaze Elvis Tajuddeen Gwadabe Tosin Adewumi Orevaoghene Ahia Joyce Nakatumba-Nabende Neo Lerato Mokono Ignatius Ezeani Chiamaka Chukwuneke Mofetoluwa Oluwaseun Adeyemi Gilles Quentin Hacheme Idris Abdulmumin Odunayo Ogundepo Oreen Yousuf Tatiana Moteu and Dietrich Klakow. 2022. MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing Yoav Goldberg Zornitsa Kozareva and Yue Zhang (Eds.). Association for Computational Linguistics Abu Dhabi United Arab Emirates 4488--4508.","DOI":"10.18653\/v1\/2022.emnlp-main.298"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"crossref","unstructured":"David Ifeoluwa Adelani Marek Masiak Israel Abebe Azime Jesujoba Alabi Atnafu Lambebo Tonja Christine Mwase Odunayo Ogundepo Bonaventure F. P. 
Dossou Akintunde Oladipo Doreen Nixdorf Chris Chinenye Emezue Sana Al-azzawi Blessing Sibanda Davis David Lolwethu Ndolela Jonathan Mukiibi Tunde Ajayi Tatiana Moteu Brian Odhiambo Abraham Owodunni Nnaemeka Obiefuna Muhidin Mohamed Shamsuddeen Hassan Muhammad Teshome Mulugeta Ababu Saheed Abdullahi Salahudeen Mesay Gemeda Yigezu Tajuddeen Gwadabe Idris Abdulmumin Mahlet Taye Oluwabusayo Awoyomi Iyanuoluwa Shode Tolulope Adelani Habiba Abdulganiyu Abdul-Hakeem Omotayo Adetola Adeeko Abeeb Afolabi Anuoluwapo Aremu Olanrewaju Samuel Clemencia Siro Wangari Kimotho Onyekachi Ogbu Chinedu Mbonu Chiamaka Chukwuneke Samuel Fanijo Jessica Ojo Oyinkansola Awosan Tadesse Kebede Toadoum Sari Sakayo Pamela Nyatsine Freedmore Sidume Oreen Yousuf Mardiyyah Oduwole Kanda Tshinu Ussen Kimanuka Thina Diko Siyanda Nxakama Sinodos Nigusse Abdulmejid Johar Shafie Mohamed Fuad Mire Hassan Moges Ahmed Mehamed Evrard Ngabire Jules Jules Ivan Ssenkungu and Pontus Stenetorp. 2023. MasakhaNEWS: News Topic Classification for African languages. In Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 1: Long Papers) Jong C. Park Yuki Arase Baotian Hu Wei Lu Derry Wijaya Ayu Purwarianti and Adila Alfa Krisnadhi (Eds.). Association for Computational Linguistics Nusa Dua Bali 144--159.","DOI":"10.18653\/v1\/2023.ijcnlp-main.10"},{"key":"e_1_3_2_1_3_1","volume-title":"Proceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation (FIRE '23)","author":"Adeyemi Mofetoluwa","year":"2024","unstructured":"Mofetoluwa Adeyemi, Akintunde Oladipo, Xinyu Zhang, David Alfonso-Hermelo, Mehdi Rezagholizadeh, Boxing Chen, and Jimmy Lin. 2024. CIRAL at FIRE 2023: Cross-Lingual Information Retrieval for African Languages. In Proceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation (FIRE '23). 
Association for Computing Machinery, New York, NY, USA, 4--6."},{"key":"e_1_3_2_1_4_1","volume-title":"Marius Mosbach, and Dietrich Klakow.","author":"Alabi Jesujoba O.","year":"2022","unstructured":"Jesujoba O. Alabi, David Ifeoluwa Adelani, Marius Mosbach, and Dietrich Klakow. 2022. Adapting Pre-trained Language Models to African Languages via Multilingual Adaptive Fine-Tuning. In Proceedings of the 29th International Conference on Computational Linguistics, Nicoletta Calzolari, Chu-Ren Huang, Hansaem Kim, James Pustejovsky, Leo Wanner, Key-Sun Choi, Pum-Mo Ryu, Hsin-Hsi Chen, Lucia Donatelli, Heng Ji, Sadao Kurohashi, Patrizia Paggio, Nianwen Xue, Seokhwan Kim, Younggyun Hahm, Zhong He, Tony Kyungil Lee, Enrico Santus, Francis Bond, and Seung-Hoon Na (Eds.). International Committee on Computational Linguistics, Gyeongju, Republic of Korea, 4336--4349."},{"key":"e_1_3_2_1_5_1","volume-title":"Sentiment Analysis Across Multiple African Languages: A Current Benchmark. arXiv e-prints (Oct","author":"Aryal Saurav K.","year":"2023","unstructured":"Saurav K. Aryal, Howard Prioleau, and Surakshya Aryal. 2023. Sentiment Analysis Across Multiple African Languages: A Current Benchmark. arXiv e-prints (Oct. 2023). arxiv: 2310.14120 [cs.CL]"},{"volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Conneau Alexis","key":"e_1_3_2_1_6_1","unstructured":"Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzm\u00e1n, Edouard Grave, Myle Ott, Luke Zettlemoyer, and Veselin Stoyanov. 2020. Unsupervised Cross-lingual Representation Learning at Scale. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Dan Jurafsky, Joyce Chai, Natalie Schluter, and Joel Tetreault (Eds.). Association for Computational Linguistics, Online, 8440--8451."},{"key":"e_1_3_2_1_7_1","unstructured":"Cheikh M. 
Bamba Dione David Ifeoluwa Adelani Peter Nabende Jesujoba Alabi Thapelo Sindane Happy Buzaaba Shamsuddeen Hassan Muhammad Chris Chinenye Emezue Perez Ogayo Anuoluwapo Aremu Catherine Gitau Derguene Mbaye Jonathan Mukiibi Blessing Sibanda Bonaventure F. P. Dossou Andiswa Bukula Rooweither Mabuya Allahsera Auguste Tapo Edwin Munkoh-Buabeng Victoire Memdjokam Koagne Fatoumata Ouoba Kabore Amelia Taylor Godson Kalipe Tebogo Macucwa Vukosi Marivate Tajuddeen Gwadabe Mboning Tchiaze Elvis Ikechukwu Onyenwe Gratien Atindogbe Tolulope Adelani Idris Akinade Olanrewaju Samuel Marien Nahimana Th\u00e9og\u00e8ne Musabeyezu Emile Niyomutabazi Ester Chimhenga Kudzai Gotosa Patrick Mizha Apelete Agbolo Seydou Traore Chinedu Uchechukwu Aliyu Yusuf Muhammad Abdullahi and Dietrich Klakow. 2023. MasakhaPOS: Part-of-Speech Tagging for Typologically Diverse African languages. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) Anna Rogers Jordan Boyd-Graber and Naoaki Okazaki (Eds.). Association for Computational Linguistics Toronto Canada 10883--10900."},{"volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP),","author":"Karpukhin Vladimir","key":"e_1_3_2_1_8_1","unstructured":"Vladimir Karpukhin, Barlas Oguz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih. 2020. Dense Passage Retrieval for Open-Domain Question Answering. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Bonnie Webber, Trevor Cohn, Yulan He, and Yang Liu (Eds.). Association for Computational Linguistics, Online, 6769--6781."},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/3397271.3401075"},{"volume-title":"Proceedings of the 1st Workshop on Multilingual Representation Learning","author":"Ogueji Kelechi","key":"e_1_3_2_1_10_1","unstructured":"Kelechi Ogueji, Yuxin Zhu, and Jimmy Lin. 2021. 
Small Data? No Problem! Exploring the Viability of Pretrained Multilingual Language Models for Low-resourced Languages. In Proceedings of the 1st Workshop on Multilingual Representation Learning, Duygu Ataman, Alexandra Birch, Alexis Conneau, Orhan Firat, Sebastian Ruder, and Gozde Gul Sahin (Eds.). Association for Computational Linguistics, Punta Cana, Dominican Republic, 116--126."},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2023.findings-emnlp.997"},{"volume-title":"Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing,","author":"Ogundepo Odunayo","key":"e_1_3_2_1_12_1","unstructured":"Odunayo Ogundepo, Xinyu Zhang, Shuo Sun, Kevin Duh, and Jimmy Lin. 2022. AfriCLIRMatrix: Enabling Cross-Lingual Information Retrieval for African Languages. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Yoav Goldberg, Zornitsa Kozareva, and Yue Zhang (Eds.). Association for Computational Linguistics, Abu Dhabi, United Arab Emirates, 8721--8728."},{"volume-title":"Proceedings of the 6th Workshop on Computational Approaches to Linguistic Code-Switching,","author":"Ogunremi Tolulope","key":"e_1_3_2_1_13_1","unstructured":"Tolulope Ogunremi, Christopher Manning, and Dan Jurafsky. 2023. Multilingual self-supervised speech representations improve the speech recognition of low-resource African languages with codeswitching. In Proceedings of the 6th Workshop on Computational Approaches to Linguistic Code-Switching, Genta Winata, Sudipta Kar, Marina Zhukova, Thamar Solorio, Mona Diab, Sunayana Sitaram, Monojit Choudhury, and Kalika Bali (Eds.). Association for Computational Linguistics, Singapore, 83--88."},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2023.emnlp-main.11"},{"key":"e_1_3_2_1_15_1","volume-title":"Online Multilingualism in African Written Conversations. 
Studies in African Linguistics","author":"P\u00e9rez-Sabater Carmen","year":"2020","unstructured":"Carmen P\u00e9rez-Sabater and Ginette Maguelouk-Moffo. 2020. Online Multilingualism in African Written Conversations. Studies in African Linguistics (2020)."},{"key":"e_1_3_2_1_16_1","volume-title":"Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval. In 9th International Conference on Learning Representations, ICLR 2021","author":"Xiong Lee","year":"2021","unstructured":"Lee Xiong, Chenyan Xiong, Ye Li, Kwok-Fung Tang, Jialin Liu, Paul N. Bennett, Junaid Ahmed, and Arnold Overwijk. 2021. Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval. In 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3-7, 2021. OpenReview.net."},{"volume-title":"Proceedings of the 2021 Conference of the North American","author":"Xue Linting","key":"e_1_3_2_1_17_1","unstructured":"Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, and Colin Raffel. 2021. mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, Online, 483--498."},{"volume-title":"Proceedings of the 1st Workshop on Multilingual Representation Learning","author":"Zhang Xinyu","key":"e_1_3_2_1_18_1","unstructured":"Xinyu Zhang, Xueguang Ma, Peng Shi, and Jimmy Lin. 2021. Mr. TyDi: A Multi-lingual Benchmark for Dense Retrieval. In Proceedings of the 1st Workshop on Multilingual Representation Learning, Duygu Ataman, Alexandra Birch, Alexis Conneau, Orhan Firat, Sebastian Ruder, and Gozde Gul Sahin (Eds.). 
Association for Computational Linguistics, Punta Cana, Dominican Republic, 127--137."},{"key":"e_1_3_2_1_19_1","article-title":"Toward Best Practices for Training Multilingual Dense Retrieval Models","volume":"42","author":"Zhang Xinyu","year":"2023","unstructured":"Xinyu Zhang, Kelechi Ogueji, Xueguang Ma, and Jimmy Lin. 2023. Toward Best Practices for Training Multilingual Dense Retrieval Models. ACM Trans. Inf. Syst., Vol. 42, 2, Article 39 (sep 2023), 33 pages.","journal-title":"ACM Trans. Inf. Syst."},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00595"}],"event":{"name":"SIGIR 2024: The 47th International ACM SIGIR Conference on Research and Development in Information Retrieval","sponsor":["SIGIR ACM Special Interest Group on Information Retrieval"],"location":"Washington DC USA","acronym":"SIGIR 2024"},"container-title":["Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3626772.3657952","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3626772.3657952","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,8,22]],"date-time":"2025-08-22T05:39:15Z","timestamp":1755841155000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3626772.3657952"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7,10]]},"references-count":20,"alternative-id":["10.1145\/3626772.3657952","10.1145\/3626772"],"URL":"https:\/\/doi.org\/10.1145\/3626772.3657952","relation":{},"subject":[],"published":{"date-parts":[[2024,7,10]]},"assertion":[{"value":"2024-07-11","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}