{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,25]],"date-time":"2025-11-25T14:14:33Z","timestamp":1764080073238,"version":"3.41.0"},"reference-count":55,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2025,3,11]],"date-time":"2025-03-11T00:00:00Z","timestamp":1741651200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Asian Low-Resour. Lang. Inf. Process."],"published-print":{"date-parts":[[2025,3,31]]},"abstract":"<jats:p>After the increased reliance on online education, online assessment became an essential tool for educators to remotely monitor and evaluate students\u2019 understanding in order to assist them properly. However, the laborious process of creating exam questions is a challenge for most teachers. Thus, automated Question Generation aims to assist teachers by generating questions from given data. Limited research has been conducted to tackle this issue in the Arabic language due to the complexity of the language and the limited amount of available Arabic data. This article explores different implementations of transformer models, which have demonstrated their superiority in natural language processing. Three approaches were introduced to tackle this problem with Arabic data using an Arabic-based transformer, an English-based transformer, and a multilingual-based transformer. Each of the fine-tuned models was trained using ARCD, XGLUE, DialectBench, and ArabicQA data sets and evaluated on automatic and manual metrics. Two of the proposed models achieve state-of-the-art results on the Arabic question generation task. The English transformer obtained a ROUGE score of 0.59 on XGLUE, while the Arabic transformer model achieved 0.49 on ARCD. 
Both models demonstrate excellent question quality in human evaluations, achieving low WER and high GC, U, and A scores.<\/jats:p>","DOI":"10.1145\/3701559","type":"journal-article","created":{"date-parts":[[2024,10,28]],"date-time":"2024-10-28T09:49:20Z","timestamp":1730108960000},"page":"1-21","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Arabic Question Generation Using Transformers"],"prefix":"10.1145","volume":"24","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4795-5915","authenticated-orcid":false,"given":"Anwar","family":"Alajmi","sequence":"first","affiliation":[{"name":"Department of Computer Engineering, College of Engineering and Petroleum, Kuwait University, Kuwait, Kuwait"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6941-4365","authenticated-orcid":false,"given":"Haniah","family":"Altabaa","sequence":"additional","affiliation":[{"name":"Department of Computer Engineering, College of Engineering and Petroleum, Kuwait University, Kuwait, Kuwait"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1849-9316","authenticated-orcid":false,"given":"Sa\u2019ed","family":"Abed","sequence":"additional","affiliation":[{"name":"Department of Computer Engineering, College of Engineering and Petroleum, Kuwait University, Kuwait, Kuwait"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0673-7324","authenticated-orcid":false,"given":"Imtiaz","family":"Ahmad","sequence":"additional","affiliation":[{"name":"Department of Computer Engineering, College of Engineering and Petroleum, Kuwait University, Kuwait, Kuwait"}]}],"member":"320","published-online":{"date-parts":[[2025,3,11]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"publisher","unstructured":"Abdelrahman Abdallah Mahmoud Kasem Mahmoud Abdalla Mohamed Mahmoud Mohamed Elkasaby Yasser Elbendary and Adam Jatowt. 2024. ArabicaQA: A comprehensive dataset for arabic question answering. 
In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (Washington DC USA) (SIGIR\u201924). Association for Computing Machinery New York NY USA 2049\u20132059. 10.1145\/3626772.3657889","DOI":"10.1145\/3626772.3657889"},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.1007\/s13369-019-04024-0"},{"key":"e_1_3_2_4_2","volume-title":"Arabic Automatic Question Generation Using Transformer Model","author":"Alhashedi Saleh","year":"2022","unstructured":"Saleh Alhashedi, Norhaida Mohd Suaib, and Aryati Bakri. 2022. Arabic Automatic Question Generation Using Transformer Model. Technical Report. EasyChair."},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10462-021-10031-1"},{"key":"e_1_3_2_6_2","first-page":"196","volume-title":"Proceedings of the 6th Arabic Natural Language Processing Workshop","author":"Antoun Wissam","year":"2021","unstructured":"Wissam Antoun, Fady Baly, and Hazem Hajj. 2021. AraGPT2: Pre-trained transformer for Arabic language generation. In Proceedings of the 6th Arabic Natural Language Processing Workshop. 196\u2013207."},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","unstructured":"Lucas Bandarkar Davis Liang Benjamin Muller Mikel Artetxe Satya Narayan Shukla Donald Husa Naman Goyal Abhinandan Krishnan Luke Zettlemoyer and Madian Khabsa. 2024. The belebele benchmark: A parallel reading comprehension dataset in 122 language variants. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics) Lun-Wei Ku Andre Martins and Vivek Srikumar (Eds.). Association for Computational Linguistics Bangkok Thailand 749\u2013775. 10.18653\/v1\/2024.acl-long.44","DOI":"10.18653\/v1\/2024.acl-long.44"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1080\/03091902.2022.2097327"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.18280\/ria.340606"},{"key":"e_1_3_2_10_2","unstructured":"Tom B. 
Brown Benjamin Mann Nick Ryder Melanie Subbiah Jared Kaplan Prafulla Dhariwal Arvind Neelakantan Pranav Shyam Girish Sastry Amanda Askell Sandhini Agarwal Ariel Herbert-Voss Gretchen Krueger Tom Henighan Rewon Child Aditya Ramesh Daniel M. Ziegler Jeffrey Wu Clemens Winter Christopher Hesse Mark Chen Eric Sigler Mateusz Litwin Scott Gray Benjamin Chess Jack Clark Christopher Berner Sam McCandlish Alec Radford Ilya Sutskever and Dario Amodei. 2020. Language models are few-shot learners. In Proceedings of the 34th International Conference on Neural Information Processing Systems (Vancouver BC Canada) (NIPS\u201920). Curran Associates Inc. Red Hook NY USA Article 159 25 pages."},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1177\/0265532210364405"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.54097\/ajst.v2i1.638"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1207\/S15328023TOP2702_01"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1186\/s41039-021-00151-1"},{"key":"e_1_3_2_15_2","first-page":"628","volume-title":"Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Elmadany AbdelRahim","year":"2022","unstructured":"AbdelRahim Elmadany, Muhammad Abdul-Mageed, et\u00a0al. 2022. AraT5: Text-to-text transformers for Arabic language generation. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 628\u2013647."},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","unstructured":"AbdelRahim Elmadany El Moatez Billah Nagoudi and Muhammad Abdul-Mageed. 2023. Octopus: A multitask model and toolkit for arabic natural language generation. In Proceedings of ArabicNLP 2023 Hassan Sawaf Samhaa El-Beltagy Wajdi Zaghouani Walid Magdy Ahmed Abdelali Nadi Tomeh Ibrahim Abu Farha Nizar Habash Salam Khalifa Amr Keleg Hatem Haddad Imed Zitouni Khalil Mrini and Rawan Almatham (Eds.). 
Association for Computational Linguistics Singapore (Hybrid) 232\u2013243. 10.18653\/v1\/2023.arabicnlp-1.20","DOI":"10.18653\/v1\/2023.arabicnlp-1.20"},{"key":"e_1_3_2_17_2","doi-asserted-by":"publisher","unstructured":"Fahim Faisal Orevaoghene Ahia Aarohi Srivastava Kabir Ahuja David Chiang Yulia Tsvetkov and Antonios Anastasopoulos. 2024. DIALECTBENCH: A NLP benchmark for dialects varieties and closely-related languages. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) Lun-Wei Ku Andre Martins and Vivek Srikumar (Eds.). Association for Computational Linguistics Bangkok Thailand 14412\u201314454. 10.18653\/v1\/2024.acl-long.777","DOI":"10.18653\/v1\/2024.acl-long.777"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1037\/h0031619"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","unstructured":"Abbas Ghaddar Yimeng Wu Sunyam Bagga Ahmad Rashid Khalil Bibi Mehdi Rezagholizadeh Chao Xing Yasheng Wang Duan Xinyu Zhefeng Wang Baoxing Huai Xin Jiang Qun Liu and Philippe Langlais. 2022. Revisiting pre-trained language models and their evaluation for arabic natural language processing. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing Yoav Goldberg Zornitsa Kozareva and Yue Zhang (Eds.). Association for Computational Linguistics Abu Dhabi United Arab Emirates 3135\u20133151. 10.18653\/v1\/2022.emnlp-main.205","DOI":"10.18653\/v1\/2022.emnlp-main.205"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","unstructured":"Bilal Ghanem Lauren Lutz Coleman Julia Rivard Dexter Spencer McIntosh von der Ohe and Alona Fyshe. 2022. Question generation for reading comprehension assessment by modeling how and what to ask. In Findings of the Association for Computational Linguistics: ACL 2022 Smaranda Muresan Preslav Nakov and Aline Villavicencio (Eds.). Association for Computational Linguistics Dublin Ireland 2131\u20132146. 
10.18653\/v1\/2022.findings-acl.168","DOI":"10.18653\/v1\/2022.findings-acl.168"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.5555\/3176748.3176757"},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.3115\/1118637.1118644"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00521-021-06748-3"},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","unstructured":"Lifu Huang Ronan Le Bras Chandra Bhagavatula and Yejin Choi. 2019. Cosmos QA: Machine reading comprehension with contextual commonsense reasoning. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) Kentaro Inui Jing Jiang Vincent Ng and Xiaojun Wan (Eds.). Association for Computational Linguistics Hong Kong China 2391\u20132401. 10.18653\/v1\/D19-124","DOI":"10.18653\/v1\/D19-124"},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.3390\/app13020903"},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.3390\/e23111449"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.3390\/e24111514"},{"key":"e_1_3_2_28_2","unstructured":"Benny G. Johnson Jeffrey S. Dittel and Rachel Van Campenhout. 2022. Parallel construction: A parallel corpus approach for automatic question generation in non-english languages. https:\/\/intextbooks.science.uu.nl\/workshop2022\/files\/itb22_p5_short9847.pdf"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","unstructured":"Md Tawkat Islam Khondaker Abdul Waheed El Moatez Billah Nagoudi and Muhammad Abdul-Mageed. 2023. GPTAraEval: A comprehensive evaluation of ChatGPT on Arabic NLP. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing Houda Bouamor Juan Pino and Kalika Bali (Eds.). Association for Computational Linguistics Singapore 220\u2013247. 
10.18653\/v1\/2023.emnlp-main.16","DOI":"10.18653\/v1\/2023.emnlp-main.16"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.1909.05017"},{"key":"e_1_3_2_31_2","first-page":"6301","volume-title":"Proceedings of the 29th International Conference on Computational Linguistics","author":"Lee Seungyeon","year":"2022","unstructured":"Seungyeon Lee and Minho Lee. 2022. Type-dependent prompt CycleQAG: Cycle consistency for multi-hop question generation. In Proceedings of the 29th International Conference on Computational Linguistics. 6301\u20136314."},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.emnlp-main.484"},{"key":"e_1_3_2_33_2","first-page":"74","volume-title":"Text Summarization Branches Out","author":"Lin Chin-Yew","year":"2004","unstructured":"Chin-Yew Lin. 2004. Rouge: A package for automatic evaluation of summaries. In Text Summarization Branches Out. 74\u201381."},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.1145\/3366423.3380270"},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","unstructured":"Luis Enrico Lopez Diane Kathryn Cruz Jan Christian Blaise Cruz and Charibeth Cheng. 2020. Simplifying paragraph-level question generation via transformer language models. (2020). DOI:10.48550\/ARXIV.2005.01107","DOI":"10.48550\/ARXIV.2005.01107"},{"key":"e_1_3_2_36_2","unstructured":"Ilya Loshchilov and Frank Hutter. 2019. Decoupled weight decay regularization. In 7th International Conference on Learning Representations ICLR 2019 New Orleans LA USA May 6-9 2019. OpenReview.net. 
https:\/\/openreview.net\/forum?id=Bkg6RiCqY7"},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.mrqa-1.4"},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W19-4612"},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.findings-naacl.111"},{"key":"e_1_3_2_40_2","article-title":"Improving language understanding by generative pre-training","author":"Radford Alec","year":"2018","unstructured":"Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. 2018. Improving language understanding by generative pre-training. OpenAI (2018).","journal-title":"OpenAI"},{"issue":"140","key":"e_1_3_2_41_2","first-page":"1","article-title":"Exploring the limits of transfer learning with a unified text-to-text transformer.","volume":"21","author":"Raffel Colin","year":"2020","unstructured":"Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. 2020. Exploring the limits of transfer learning with a unified text-to-text transformer. The Journal of Machine Learning Research 21, 140 (2020), 1\u201367.","journal-title":"The Journal of Machine Learning Research"},{"key":"e_1_3_2_42_2","article-title":"Automatic generation system of essay questions from arabic texts","volume":"2","author":"Saad Abeer M.","year":"2017","unstructured":"Abeer M. Saad and Doaa M. Hawa. 2017. Automatic generation system of essay questions from arabic texts. 
EPRA International Journal of Research and Development 2, 5 (2017).","journal-title":"EPRA International Journal of Research and Development"},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1604"},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.acl-short.88"},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.findings-emnlp.416"},{"key":"e_1_3_2_46_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-14647-4_6"},{"key":"e_1_3_2_47_2","volume-title":"Proceedings of the 22nd Annual Conference of the European Association for Machine Translation","author":"Tiedemann J\u00f6rg","year":"2020","unstructured":"J\u00f6rg Tiedemann and Santhosh Thottingal. 2020. OPUS-MT\u2013Building open translation services for the World. In Proceedings of the 22nd Annual Conference of the European Association for Machine Translation. European Association for Machine Translation."},{"key":"e_1_3_2_48_2","article-title":"The Llama 3 herd of models","author":"Touvron Hugo","year":"2024","unstructured":"Hugo Touvron, Thibaut Lavril, and Gautier Izacard. 2024. The Llama 3 herd of models. arXiv:2409.01234. Retrieved from https:\/\/arxiv.org\/abs\/2409.01234","journal-title":"arXiv:2409.01234"},{"key":"e_1_3_2_49_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.aei.2019.101027"},{"key":"e_1_3_2_50_2","doi-asserted-by":"publisher","DOI":"10.1109\/WACV56688.2023.00438"},{"key":"e_1_3_2_51_2","doi-asserted-by":"publisher","unstructured":"Asahi Ushio Fernando Alva-Manchego and Jose Camacho-Collados. 2022. Generative language models for paragraph-level question generation. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing Yoav Goldberg Zornitsa Kozareva and Yue Zhang (Eds.). Association for Computational Linguistics Abu Dhabi United Arab Emirates 670\u2013688. 
10.18653\/v1\/2022.emnlp-main.42","DOI":"10.18653\/v1\/2022.emnlp-main.42"},{"key":"e_1_3_2_52_2","unstructured":"Ashish Vaswani Noam Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan N. Gomez \u0141ukasz Kaiser and Illia Polosukhin. 2017. Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems (Long Beach California USA) (NIPS\u201917). Curran Associates Inc. Red Hook NY USA 6000\u20136010."},{"key":"e_1_3_2_53_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.naacl-main.41"},{"key":"e_1_3_2_54_2","doi-asserted-by":"publisher","DOI":"10.1145\/3558100.3563846"},{"issue":"1","key":"e_1_3_2_55_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3468889","article-title":"A review on question generation from natural language text","volume":"40","author":"Zhang Ruqing","year":"2021","unstructured":"Ruqing Zhang, Jiafeng Guo, Lu Chen, Yixing Fan, and Xueqi Cheng. 2021. A review on question generation from natural language text. ACM Transactions on Information Systems 40, 1 (2021), 1\u201343.","journal-title":"ACM Transactions on Information Systems"},{"key":"e_1_3_2_56_2","article-title":"Transformer-xh: Multi-evidence reasoning with extra hop attention","author":"Zhao Chen","year":"2020","unstructured":"Chen Zhao, Chenyan Xiong, Corby Rosset, Xia Song, Paul Bennett, and Saurabh Tiwary. 2020. Transformer-xh: Multi-evidence reasoning with extra hop attention. 
In Proceedings of the International Conference on Learning Representations ICLR (2020).","journal-title":"In Proceedings of the International Conference on Learning Representations ICLR"}],"container-title":["ACM Transactions on Asian and Low-Resource Language Information Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3701559","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3701559","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:18:00Z","timestamp":1750295880000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3701559"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,3,11]]},"references-count":55,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2025,3,31]]}},"alternative-id":["10.1145\/3701559"],"URL":"https:\/\/doi.org\/10.1145\/3701559","relation":{},"ISSN":["2375-4699","2375-4702"],"issn-type":[{"type":"print","value":"2375-4699"},{"type":"electronic","value":"2375-4702"}],"subject":[],"published":{"date-parts":[[2025,3,11]]},"assertion":[{"value":"2023-09-21","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-10-16","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-03-11","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}