{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,7]],"date-time":"2026-03-07T05:56:03Z","timestamp":1772862963955,"version":"3.50.1"},"reference-count":69,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2025,11,23]],"date-time":"2025-11-23T00:00:00Z","timestamp":1763856000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,12,29]],"date-time":"2025-12-29T00:00:00Z","timestamp":1766966400000},"content-version":"vor","delay-in-days":36,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Discov Artif Intell"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Background<\/jats:title>\n                    <jats:p>Recent studies emphasize online health information quality. However, little research focuses on Arabic content for emergency conditions like heart disease, hypertension, and stroke despite the high demand for this information. This study aims to address this gap by using artificial intelligence to create a benchmark dataset for Arabic health information and evaluate its quality.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Methods<\/jats:title>\n                    <jats:p>We assessed the quality of health information across three criteria: source quality, treatment quality, and content trustworthiness. The Kruskal-Wallis test was used to analyze quality differences across content types (General Health Information, Medical Advice, Treatment Description) and website categories (Government, Journalistic, Portal, Professional). 
Data augmentation techniques such as paraphrasing, back translation, and RandAugment were also employed to enhance quality assessment using the Arabic BERT model. The study also proposes a novel architecture termed the Mixture of Classification. In this approach, each health document is processed in parallel by three instances of an Arabic BERT model: the first identifies the type of health information, the second determines the category of the provider, and the third estimates a continuous quality score via regression.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>\n                      Significant quality differences were observed among website categories and content types. Portal sites achieved the highest mean score (11.64), while Journalistic sites scored the lowest (3.33). Treatment Descriptions had the highest score (18.74), while Medical Advice scored the lowest (4.73). These differences were statistically significant with large effect sizes (Cohen\u2019s\n                      <jats:inline-formula>\n                        <jats:tex-math>$$f = 0.409$$<\/jats:tex-math>\n                      <\/jats:inline-formula>\n                      for categories;\n                      <jats:inline-formula>\n                        <jats:tex-math>$$f = 0.288$$<\/jats:tex-math>\n                      <\/jats:inline-formula>\n                      for content types), indicating substantial practical impact. Paraphrasing showed the best performance, with 89% accuracy, 85% F1 score, 76% precision, and 82% recall with high confidence and minimal class imbalance. A notable 36.3% accuracy gap was identified between low-quality (95.01%) and high-quality (58.71%) content using RandAugment data. 
These findings highlight the importance of content type, provider category, and quality score for improving health information search rankings.\n                    <\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Conclusion<\/jats:title>\n                    <jats:p>Content type, provider category, and quality score are key factors in enhancing the ranking of Arabic health information. Paraphrased data augmentation contributes to improved model reliability in distinguishing between quality classes. Future research should extend this approach to other languages and health-related topics. However, a major challenge remains: achieving a balanced dataset, particularly for binary classification between high- and low-quality content, as well as for the new classification based on provider category, content type, and quality score. The goal is to ensure an equal distribution across all these categories.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1007\/s44163-025-00679-x","type":"journal-article","created":{"date-parts":[[2025,11,23]],"date-time":"2025-11-23T05:44:01Z","timestamp":1763876641000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Artificial intelligence for automating the establishment of an Arabic benchmark dataset for enhancing health information quality 
assessment"],"prefix":"10.1007","volume":"5","author":[{"given":"Yousef","family":"Baqraf","sequence":"first","affiliation":[]},{"given":"Pantea","family":"Keikhosrokiani","sequence":"additional","affiliation":[]},{"given":"Yu-N.","family":"Cheah","sequence":"additional","affiliation":[]},{"given":"Amany","family":"Alomarji","sequence":"additional","affiliation":[]},{"given":"Fatima","family":"Baqraf","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,11,23]]},"reference":[{"key":"679_CR1","doi-asserted-by":"publisher","first-page":"1884","DOI":"10.1007\/s11606-019-05109-0","volume":"34","author":"L Daraz","year":"2019","unstructured":"Daraz L, Morrow AS, Ponce OJ, Beuschel B, Farah MH, Katabi A, et al. Can patients trust online health information? a meta-narrative systematic review addressing the quality of health information on the internet. J Gen Intern Med. 2019;34:1884\u201391.","journal-title":"J Gen Intern Med"},{"key":"679_CR2","volume-title":"The Philosophy of Information Quality","author":"P Ghezzi","year":"2024","unstructured":"Ghezzi P, Chumber S, Brabazon T. Educating medical students to evaluate the quality of health information on the web. In: Luciano F, Phyllis I, editors. The Philosophy of Information Quality. Cham: Springer International Publishing; 2024."},{"key":"679_CR3","unstructured":"Council of Europe: Algorithms and human rights. Report, Council of Europe (2018). [Online]. https:\/\/rm.coe.int\/algorithms-and-human-rights-en-rev\/16807956b5"},{"key":"679_CR4","doi-asserted-by":"publisher","first-page":"63","DOI":"10.1016\/j.pec.2010.09.004","volume":"81","author":"C-J Lee","year":"2010","unstructured":"Lee C-J, Gray SW, Lewis N. Internet use leads cancer patients to be active health care consumers. Patient Educ Couns. 
2010;81:63\u20139.","journal-title":"Patient Educ Couns"},{"issue":"2","key":"679_CR5","doi-asserted-by":"publisher","first-page":"145","DOI":"10.1002\/asi.10017","volume":"53","author":"SY Rieh","year":"2002","unstructured":"Rieh SY. Judgment of information quality and cognitive authority in the web. J Am Soc Inform Sci Technol. 2002;53(2):145\u201361.","journal-title":"J Am Soc Inform Sci Technol"},{"issue":"7337","key":"679_CR6","doi-asserted-by":"publisher","first-page":"573","DOI":"10.1136\/bmj.324.7337.573","volume":"324","author":"G Eysenbach","year":"2002","unstructured":"Eysenbach G, K\u00f6hler C. How do consumers search for and appraise health information on the world wide web? qualitative study using focus groups, usability tests, and in-depth interviews. BMJ. 2002;324(7337):573\u20137.","journal-title":"BMJ"},{"issue":"7337","key":"679_CR7","doi-asserted-by":"publisher","first-page":"573","DOI":"10.1136\/bmj.324.7337.573","volume":"324","author":"G Eysenbach","year":"2002","unstructured":"Eysenbach G, K\u00f6hler C. How do consumers search for and appraise health information on the world wide web? qualitative study using focus groups, usability tests, and in-depth interviews. BMJ. 2002;324(7337):573\u20137.","journal-title":"BMJ"},{"key":"679_CR8","unstructured":"Eysenbach G. Credibility of health information and digital media: new perspectives and implications for youth, 2007."},{"key":"679_CR9","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-07121-3_2","volume-title":"The Philosophy of Information Quality","author":"P Illari","year":"2014","unstructured":"Illari P, Floridi L. Information quality, data and philosophy. In: Luciano F, Phyllis I, editors. The Philosophy of Information Quality. Berlin: Springer; 2014."},{"key":"679_CR10","doi-asserted-by":"crossref","unstructured":"Baqraf Y, Keikhosrokiani P. The prediction of health information quality perception using machine learning and deep learning techniques. 
In: 2023 11th International Conference on Information and Communication Technology (ICoICT), IEEE: 2023; pp. 104\u2013109.","DOI":"10.1109\/ICoICT58202.2023.10262623"},{"key":"679_CR11","doi-asserted-by":"crossref","unstructured":"Pogacar FA, Ghenai A, Smucker MD, Clarke CL. The positive and negative influence of search results on people\u2019s decisions about the efficacy of medical treatments. In: Proceedings of the ACM SIGIR International Conference on Theory of Information Retrieval, 2017; pp. 209\u2013216.","DOI":"10.1145\/3121050.3121074"},{"key":"679_CR12","unstructured":"Abualsaud M, Smucker MD. Exposure and order effects of misinformation on health search decisions. In: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval. Rome 2019."},{"key":"679_CR13","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s12911-021-01413-0","volume":"21","author":"MS Al-Ak\u2019hali","year":"2021","unstructured":"Al-Ak\u2019hali MS, Fageeh HN, Halboub E, Alhajj MN, Ariffin Z. Quality and readability of web-based Arabic health information on periodontal disease. BMC Med Inform Decis Mak. 2021;21:1\u20138.","journal-title":"BMC Med Inform Decis Mak"},{"issue":"1","key":"679_CR14","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s12889-021-10218-9","volume":"21","author":"E Halboub","year":"2021","unstructured":"Halboub E, Al-Ak\u2019hali MS, Al-Mekhlafi HM, Alhajj MN. Quality and readability of web-based Arabic health information on covid-19: an infodemiological study. BMC Public Health. 2021;21(1):1\u20137.","journal-title":"BMC Public Health"},{"issue":"4","key":"679_CR15","doi-asserted-by":"publisher","first-page":"961","DOI":"10.31557\/APJCP.2020.21.4.961","volume":"21","author":"MS Alakhali","year":"2020","unstructured":"Alakhali MS. Quality assessment of information on oral cancer provided at Arabic speaking websites. Asian Pac J Cancer Prev APJCP. 
2020;21(4):961.","journal-title":"Asian Pac J Cancer Prev APJCP"},{"key":"679_CR16","doi-asserted-by":"crossref","unstructured":"Boutayeb, W.: The growing trend of noncommunicable diseases in arab countries. Disease prevention and health promotion in developing countries, 2020; pp. 61\u201372.","DOI":"10.1007\/978-3-030-34702-4_5"},{"key":"679_CR17","doi-asserted-by":"publisher","first-page":"357","DOI":"10.1007\/s10654-015-0026-5","volume":"30","author":"L Chaker","year":"2015","unstructured":"Chaker L, Falla A, Lee SJ, Muka T, Imo D, Jaspers L, et al. The global impact of non-communicable diseases on macro-economic productivity: a systematic review. Eur J Epidemiol. 2015;30:357\u201395.","journal-title":"Eur J Epidemiol"},{"issue":"7508","key":"679_CR18","doi-asserted-by":"publisher","first-page":"147","DOI":"10.1038\/511147a","volume":"511","author":"LO Gostin","year":"2014","unstructured":"Gostin LO. Non-communicable diseases: healthy living needs global governance. Nature. 2014;511(7508):147\u20139.","journal-title":"Nature"},{"issue":"4","key":"679_CR19","doi-asserted-by":"publisher","first-page":"321","DOI":"10.1093\/intqhc\/mzy033","volume":"30","author":"N Zakaria","year":"2018","unstructured":"Zakaria N, AlFakhry O, Matbuli A, Alzahrani A, Arab NSS, Madani A, et al. Development of Saudi e-health literacy scale for chronic diseases in Saudi Arabia: using integrated health literacy dimensions. Int J Qual Health Care. 2018;30(4):321\u20138.","journal-title":"Int J Qual Health Care"},{"issue":"22","key":"679_CR20","doi-asserted-by":"publisher","first-page":"8615","DOI":"10.3390\/s22228615","volume":"22","author":"A Mavrogiorgou","year":"2022","unstructured":"Mavrogiorgou A, Kiourtis A, Kleftakis S, Mavrogiorgos K, Zafeiropoulos N, Kyriazis D. A catalogue of machine learning algorithms for healthcare risk predictions. Sensors. 
2022;22(22):8615.","journal-title":"Sensors"},{"key":"679_CR21","doi-asserted-by":"crossref","unstructured":"Zafeiropoulos N, Mavrogiorgou A, Kleftakis S, Mavrogiorgos K, Kiourtis A, Kyriazis D. Interpretable stroke risk prediction using machine learning algorithms. In: Intelligent Sustainable Systems: Selected Papers of WorldS4 2022. Volume 2. Berlin: Springer; 2023. p. 647\u201356.","DOI":"10.1007\/978-981-19-7663-6_61"},{"key":"679_CR22","doi-asserted-by":"crossref","unstructured":"Ferrara E. Large language models for wearable sensor-based human activity recognition, health monitoring, and behavioral modeling: a survey of early trends, datasets, and challenges. arXiv preprint arXiv:2407.07196 2024.","DOI":"10.20944\/preprints202407.0970.v1"},{"key":"679_CR23","doi-asserted-by":"crossref","unstructured":"Schlicht IB, Zhao Z, Sayin B, Flek L, Rosso P. Do llms provide consistent answers to health-related questions across languages? In: European Conference on Information Retrieval. Berlin: Springer; 2025; pp. 314\u2013322 .","DOI":"10.1007\/978-3-031-88714-7_30"},{"key":"679_CR24","doi-asserted-by":"crossref","unstructured":"Re\u0161\u010di\u010d N, Alberts J, Altenburg TM, Chinapaw MJ, De\u00a0Nigro A, Fenoglio D, Gjoreski M, Gradi\u0161ek A, Jurak G, Kiourtis A, et al. Smartchange: Ai-based long-term health risk evaluation for driving behaviour change strategies in children and youth. In: 2023 International Conference on Applied Mathematics & Computer Science (ICAMCS). IEEE; 2023; pp. 81\u201389.","DOI":"10.1109\/ICAMCS59110.2023.00020"},{"key":"679_CR25","doi-asserted-by":"crossref","unstructured":"Mavrogiorgou, A., Kleftakis, S., Mavrogiorgos, K., Zafeiropoulos, N., Menychtas, A., Kiourtis, A., Maglogiannis, I., Kyriazis, D.: behealthier: a microservices platform for analyzing and exploiting healthcare data. In: 2021 IEEE 34th International Symposium on Computer-based Medical Systems (CBMS). IEEE, 2021; pp. 
283\u2013288.","DOI":"10.1109\/CBMS52027.2021.00078"},{"key":"679_CR26","doi-asserted-by":"publisher","first-page":"205520762312122","DOI":"10.1177\/20552076231212296","volume":"9","author":"YKA Baqraf","year":"2023","unstructured":"Baqraf YKA, Keikhosrokiani P, Al-Rawashdeh M. Evaluating online health information quality using machine learning and deep learning: a systematic literature review. Digital Health. 2023;9:20552076231212296.","journal-title":"Digital Health"},{"issue":"4","key":"679_CR27","doi-asserted-by":"publisher","first-page":"2173","DOI":"10.3390\/ijerph19042173","volume":"19","author":"S Di Sotto","year":"2022","unstructured":"Di Sotto S, Viviani M. Health misinformation detection in the social web: an overview and a data science approach. Int J Environ Res Public Health. 2022;19(4):2173.","journal-title":"Int J Environ Res Public Health"},{"issue":"4","key":"679_CR28","doi-asserted-by":"publisher","first-page":"379","DOI":"10.2174\/1573406416666200611104143","volume":"19","author":"AS Bhagavathula","year":"2021","unstructured":"Bhagavathula AS, Shehab A, Ullah A, Rahmani J. The burden of cardiovascular disease risk factors in the middle east: a systematic review and meta-analysis focusing on primary prevention. Curr Vasc Pharmacol. 2021;19(4):379\u201389.","journal-title":"Curr Vasc Pharmacol"},{"key":"679_CR29","unstructured":"Organization WH. Cardiovascular diseases (CVDs). World Health Organization. Retrieved on 11 June 2021 2021. https:\/\/www.who.int\/news-room\/fact-sheets\/detail\/cardiovascular-diseases-(cvds)"},{"issue":"4","key":"679_CR30","doi-asserted-by":"publisher","first-page":"379","DOI":"10.2174\/1573406416666200611104143","volume":"19","author":"AS Bhagavathula","year":"2021","unstructured":"Bhagavathula AS, Shehab A, Ullah A, Rahmani J. The burden of cardiovascular disease risk factors in the middle east: a systematic review and meta-analysis focusing on primary prevention. Curr Vasc Pharmacol. 
2021;19(4):379\u201389.","journal-title":"Curr Vasc Pharmacol"},{"issue":"5","key":"679_CR31","doi-asserted-by":"publisher","first-page":"428","DOI":"10.1038\/s41371-021-00554-z","volume":"36","author":"M Abboud","year":"2022","unstructured":"Abboud M, Karam S. Hypertension in the middle east: current state, human factors, and barriers to control. J Hum Hypertens. 2022;36(5):428\u201336.","journal-title":"J Hum Hypertens"},{"issue":"1","key":"679_CR32","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41598-022-06418-x","volume":"12","author":"M Jaberinezhad","year":"2022","unstructured":"Jaberinezhad M, Farhoudi M, Nejadghaderi SA, Alizadeh M, Sullman MJ, Carson-Chahhoud K, et al. The burden of stroke and its attributable risk factors in the middle east and north Africa region, 1990\u20132019. Sci Rep. 2022;12(1):1\u201311.","journal-title":"Sci Rep"},{"key":"679_CR33","unstructured":"Bianchini S, M\u00fcller M, Pelletier P. Deep learning in science. arXiv preprint arXiv:2009.01575 2020."},{"key":"679_CR34","doi-asserted-by":"crossref","unstructured":"Clarke Z, Ghezzi P. Analysis of online information available for treatment of depression. medRxiv, 19008284 2019.","DOI":"10.1101\/19008284"},{"key":"679_CR35","doi-asserted-by":"crossref","unstructured":"Baqraf Y, Keikhosrokiani P. Health information quality assessment using artificial intelligence: quality dimensions from healthcare professionals\u2019 perspective. In: International Conference of Reliable Information and Communication Technology; Berlin: Springer. 2024; pp. 1\u201314.","DOI":"10.1007\/978-3-031-59711-4_1"},{"issue":"11","key":"679_CR36","doi-asserted-by":"publisher","first-page":"86","DOI":"10.1145\/240455.240479","volume":"39","author":"Y Wand","year":"1996","unstructured":"Wand Y, Wang RY. Anchoring data quality dimensions in ontological foundations. Commun ACM. 
1996;39(11):86\u201395.","journal-title":"Commun ACM"},{"issue":"2","key":"679_CR37","doi-asserted-by":"publisher","first-page":"105","DOI":"10.1136\/jech.53.2.105","volume":"53","author":"D Charnock","year":"1999","unstructured":"Charnock D, Shepperd S, Needham G, Gann R. Discern: an instrument for judging the quality of written consumer health information on treatment choices. J Epidemiol Community Health. 1999;53(2):105\u201311.","journal-title":"J Epidemiol Community Health"},{"issue":"5","key":"679_CR38","doi-asserted-by":"publisher","first-page":"603","DOI":"10.1016\/S0010-4825(98)00037-7","volume":"28","author":"C Boyer","year":"1998","unstructured":"Boyer C, Selby M, Scherrer J-R, Appel R. The health on the net code of conduct for medical and health websites. Comput Biol Med. 1998;28(5):603\u201310.","journal-title":"Comput Biol Med"},{"issue":"15","key":"679_CR39","doi-asserted-by":"publisher","first-page":"1244","DOI":"10.1001\/jama.1997.03540390074039","volume":"277","author":"WM Silberg","year":"1997","unstructured":"Silberg WM, Lundberg GD, Musacchio RA. Assessing, controlling, and assuring the quality of medical information on the internet: Caveant lector et viewor-let the reader and viewer beware. JAMA. 1997;277(15):1244\u20135.","journal-title":"JAMA"},{"key":"679_CR40","doi-asserted-by":"publisher","first-page":"204","DOI":"10.3389\/fpubh.2015.00204","volume":"3","author":"M Yaqub","year":"2015","unstructured":"Yaqub M, Ghezzi P. Adding dimensions to the analysis of the quality of health information of websites returned by google: cluster analysis identifies patterns of websites according to their classification and the type of intervention described. Front Public Health. 2015;3:204.","journal-title":"Front Public Health"},{"key":"679_CR41","doi-asserted-by":"crossref","unstructured":"Al-Jefri MM, Evans R, Ghezzi P, Uchyigit G. Using machine learning for automatic identification of evidence-based health information on the web. 
In: Proceedings of the 2017 International Conference on Digital Health, 2017; pp. 167\u2013174.","DOI":"10.1145\/3079452.3079470"},{"key":"679_CR42","unstructured":"Hinkle JL, Cheever KH, Overbaugh K. Brunner and Suddarth\u2019s Textbook of Medical-Surgical Nursing_0781731933.pdf. 2021."},{"key":"679_CR43","unstructured":"Gommers R, Virtanen P, Burovski E, Weckesser W, Oliphant TE, Haberland M, Cournapeau D, Reddy T, Peterson P, Nelson A, et al. scipy\/scipy: Scipy 1.13. Zenodo, 2024."},{"issue":"260","key":"679_CR44","doi-asserted-by":"publisher","first-page":"583","DOI":"10.1080\/01621459.1952.10483441","volume":"47","author":"WH Kruskal","year":"1952","unstructured":"Kruskal WH, Wallis WA. Use of ranks in one-criterion variance analysis. J Am Stat Assoc. 1952;47(260):583\u2013621.","journal-title":"J Am Stat Assoc"},{"key":"679_CR45","unstructured":"Cohen MX. Modern statistics: Intuition, math, python, r. (No Title) 2023."},{"key":"679_CR46","unstructured":"Tomczak M, Tomczak E. The need to report effect size estimates revisited. an overview of some recommended measures of effect size 2014."},{"key":"679_CR47","doi-asserted-by":"publisher","DOI":"10.4324\/9780203771587","volume-title":"Statistical power analysis for the behavioral sciences","author":"J Cohen","year":"2013","unstructured":"Cohen J. Statistical power analysis for the behavioral sciences. Milton Park: Routledge; 2013."},{"issue":"36","key":"679_CR48","doi-asserted-by":"publisher","first-page":"1169","DOI":"10.21105\/joss.01169","volume":"4","author":"MA Terpilowski","year":"2019","unstructured":"Terpilowski MA. scikit-posthocs: pairwise multiple comparison tests in python. J Open Source Softw. 2019;4(36):1169.","journal-title":"J Open Source Softw"},{"key":"679_CR49","unstructured":"Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L, et al. Pytorch: an imperative style, high-performance deep learning library. 
Adv Neural Inf Process Syst, 2019;32."},{"key":"679_CR50","doi-asserted-by":"crossref","unstructured":"Safaya A, Abdullatif M, Yuret D. KUISAIL at SemEval-2020 task 12: BERT-CNN for offensive speech identification in social media. In: Proceedings of the Fourteenth Workshop on Semantic Evaluation, pp. 2054\u20132059. International Committee for Computational Linguistics, Barcelona (online) 2020. https:\/\/www.aclweb.org\/anthology\/2020.semeval-1.271","DOI":"10.18653\/v1\/2020.semeval-1.271"},{"issue":"4","key":"679_CR51","doi-asserted-by":"publisher","first-page":"433","DOI":"10.1002\/wics.101","volume":"2","author":"H Abdi","year":"2010","unstructured":"Abdi H, Williams LJ. Principal component analysis. Wiley Interdiscip Rev Comput Stat. 2010;2(4):433\u201359.","journal-title":"Wiley Interdiscip Rev Comput Stat"},{"key":"679_CR52","unstructured":"Eddine MK, Tomeh N, Habash N, Roux JL, Vazirgiannis M. Arabart: a pretrained arabic sequence-to-sequence model for abstractive summarization. arXiv preprint arXiv:2203.10945 2022."},{"key":"679_CR53","unstructured":"Antoun W, Baly F, Hajj H. Arabert: transformer-based model for Arabic language understanding. In: LREC 2020 Workshop Language Resources and Evaluation Conference 11\u201316 May 2020, p. 9."},{"key":"679_CR54","doi-asserted-by":"crossref","unstructured":"Khasawneh MAS, Khasawneh YJA. Achieving assessment equity and fairness: identifying and eliminating bias in assessment tools and practices 2023.","DOI":"10.20944\/preprints202306.0730.v1"},{"issue":"19","key":"679_CR55","doi-asserted-by":"publisher","first-page":"8860","DOI":"10.3390\/app14198860","volume":"14","author":"K Mavrogiorgos","year":"2024","unstructured":"Mavrogiorgos K, Kiourtis A, Mavrogiorgou A, Menychtas A, Kyriazis D. Bias in machine learning: a literature review. Appl Sci. 2024;14(19):8860.","journal-title":"Appl Sci"},{"key":"679_CR56","unstructured":"Ma E. NLP Augmentation. 2019. 
https:\/\/github.com\/makcedward\/nlpaug"},{"key":"679_CR57","doi-asserted-by":"crossref","unstructured":"Canca C. Did You Find It on the Internet? Ethical Complexities of Search Engine Rankings. 2022.","DOI":"10.1007\/978-3-030-86144-5_47"},{"key":"679_CR58","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s12911-020-01131-z","volume":"20","author":"L Kinkead","year":"2020","unstructured":"Kinkead L, Allam A, Krauthammer M. Autodiscern: rating the quality of online health information with hierarchical encoder attention-based neural networks. BMC Med Inform Decis Mak. 2020;20:1\u201313.","journal-title":"BMC Med Inform Decis Mak"},{"key":"679_CR59","doi-asserted-by":"crossref","unstructured":"Refai D, Abo-Soud S, Abdel-Rahman M. Data augmentation using transformers and similarity measures for improving arabic text classification, 2023. https:\/\/arxiv.org\/abs\/2212.13939","DOI":"10.1109\/ACCESS.2023.3336311"},{"key":"679_CR60","doi-asserted-by":"crossref","unstructured":"ElSabagh AA, Azab SS, Hefny HA. A comprehensive survey on Arabic text augmentation: approaches, challenges, and applications. Neural Comput Appl, 2025;1\u201334.","DOI":"10.1007\/s00521-025-11020-z"},{"issue":"1","key":"679_CR61","doi-asserted-by":"publisher","first-page":"69","DOI":"10.1080\/14737167.2020.1734453","volume":"20","author":"U Ozolins","year":"2020","unstructured":"Ozolins U, Hale S, Cheng X, Hyatt A, Schofield P. Translation and back-translation methodology in health research-a critique. Expert Rev Pharmacoecon Outcomes Res. 2020;20(1):69\u201377.","journal-title":"Expert Rev Pharmacoecon Outcomes Res"},{"issue":"1","key":"679_CR62","doi-asserted-by":"publisher","first-page":"146045822110709","DOI":"10.1177\/14604582211070998","volume":"28","author":"A Alasmari","year":"2022","unstructured":"Alasmari A, Alhothali A, Allinjawi A. Hybrid machine learning approach for Arabic medical web page credibility assessment. Health Informatics J. 
2022;28(1):14604582211070998.","journal-title":"Health Informatics J"},{"key":"679_CR63","doi-asserted-by":"crossref","unstructured":"Knight S-A., Burn J. Developing a framework for assessing information quality on the world wide web. Informing Science 2005;8.","DOI":"10.28945\/2854"},{"key":"679_CR64","doi-asserted-by":"crossref","unstructured":"Schwarz J, Morris M. Augmenting web pages and search results to support credibility assessment. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 2011;pp. 1245\u20131254.","DOI":"10.1145\/1978942.1979127"},{"key":"679_CR65","doi-asserted-by":"crossref","unstructured":"Sondhi, P., Vydiswaran, V.V., Zhai, C.: Reliability prediction of webpages in the medical domain. In: ECIR, Berlin: Springer, 2012;12:219\u2013231.","DOI":"10.1007\/978-3-642-28997-2_19"},{"key":"679_CR66","unstructured":"Rangapur A, Wang H, Shu K. Investigating online financial misinformation and its consequences: A computational perspective. arXiv preprint arXiv:2309.12363 2023."},{"key":"679_CR67","unstructured":"Shiang LS, Wilson S. Unraveling fake news in Malaysia: a comprehensive analysis from legal and journalistic perspective. Plaridel 2024;21(1)."},{"key":"679_CR68","doi-asserted-by":"publisher","first-page":"1531126","DOI":"10.3389\/fcomm.2024.1531126","volume":"9","author":"JC-E Liu","year":"2025","unstructured":"Liu JC-E, Lee C-F. Climate and energy misinformation in Taiwan. Front commun. 2025;9:1531126.","journal-title":"Front commun"},{"issue":"1","key":"679_CR69","doi-asserted-by":"publisher","first-page":"59345","DOI":"10.2196\/59345","volume":"13","author":"M Vivion","year":"2024","unstructured":"Vivion M, Trottier V, Bouh\u00ealier \u00c8, Goupil-Sormany I, Diallo T, et al. Misinformation about climate change and related environmental events on social media: Protocol for a scoping review. JMIR Res Protocols. 
2024;13(1):59345.","journal-title":"JMIR Res Protocols"}],"container-title":["Discover Artificial Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44163-025-00679-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s44163-025-00679-x","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44163-025-00679-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,12,29]],"date-time":"2025-12-29T21:24:17Z","timestamp":1767043457000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s44163-025-00679-x"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,11,23]]},"references-count":69,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["679"],"URL":"https:\/\/doi.org\/10.1007\/s44163-025-00679-x","relation":{},"ISSN":["2731-0809"],"issn-type":[{"value":"2731-0809","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,11,23]]},"assertion":[{"value":"8 May 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"14 November 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"23 November 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"This study involves collecting and analyzing web pages with no human participants involved. Therefore, ethics approval was not required. 
Our study involves the collection of web pages and does not include human participants. As a result, patient consent is not required for this research.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that there are no conflicts of interest related to the research, authorship, or publication of this article.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"411"}}