{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,6]],"date-time":"2026-05-06T16:31:06Z","timestamp":1778085066470,"version":"3.51.4"},"reference-count":33,"publisher":"MDPI AG","issue":"9","license":[{"start":{"date-parts":[[2025,9,4]],"date-time":"2025-09-04T00:00:00Z","timestamp":1756944000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>The development of generative AI Large Language Models (LLMs) raised the alarm regarding the identification of content produced by generative AI vs. humans. In one case, issues arise when students heavily rely on such tools in a manner that can affect the development of their writing or coding skills. Other issues of plagiarism also apply. This study aims to support efforts to detect and identify textual content generated using LLM tools. We hypothesize that LLM-generated text is detectable by machine learning (ML) and investigate ML models that can recognize and differentiate between texts generated by humans and multiple LLM tools. We used a dataset of student-written text in comparison with LLM-written text. We leveraged several ML and Deep Learning (DL) algorithms, such as Random Forest (RF) and Recurrent Neural Networks (RNNs) and utilized Explainable Artificial Intelligence (XAI) to understand the important features in attribution. Our method is divided into (1) binary classification to differentiate between human-written and AI-generated text and (2) multi-classification to differentiate between human-written text and text generated by five different LLM tools (ChatGPT, LLaMA, Google Bard, Claude, and Perplexity). Results show high accuracy in multi- and binary classification. Our model outperformed GPTZero (78.3%), with an accuracy of 98.5%. 
Notably, GPTZero was unable to recognize about 4.2% of the observations, but our model was able to recognize the complete test dataset. XAI results showed that understanding feature importance across different classes enables detailed author\/source profiles, aiding in attribution and supporting plagiarism detection by highlighting unique stylistic and structural elements, thereby ensuring robust verification of content originality.<\/jats:p>","DOI":"10.3390\/info16090767","type":"journal-article","created":{"date-parts":[[2025,9,5]],"date-time":"2025-09-05T07:46:17Z","timestamp":1757058377000},"page":"767","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["Leveraging Explainable AI for LLM Text Attribution: Differentiating Human-Written and Multiple LLM-Generated Text"],"prefix":"10.3390","volume":"16","author":[{"given":"Ayat A.","family":"Najjar","sequence":"first","affiliation":[{"name":"AI and Data Science Department, Arab American University, Jenin P.O Box 240, Palestine"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6835-8338","authenticated-orcid":false,"given":"Huthaifa I.","family":"Ashqar","sequence":"additional","affiliation":[{"name":"AI and Data Science Department, Arab American University, Jenin P.O Box 240, Palestine"},{"name":"AI Program, Columbia University, New York, NY 10027, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8346-7148","authenticated-orcid":false,"given":"Omar","family":"Darwish","sequence":"additional","affiliation":[{"name":"IoT and Cybersecurity Lab, Eastern Michigan University, Ypsilanti, MI 48197, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6069-1550","authenticated-orcid":false,"given":"Eman","family":"Hammad","sequence":"additional","affiliation":[{"name":"iSTAR Lab, Texas A&M University, College 
Station, TX 77840, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,9,4]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Hadi, M.U., Qureshi, R., Shah, A., Irfan, M., Zafar, A., Shaikh, M.B., Akhtar, N., Wu, J., and Mirjalili, S. (2023). A survey on large language models: Applications, challenges, limitations, and practical usage. Authorea Prepr.","DOI":"10.36227\/techrxiv.23589741.v1"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"94","DOI":"10.14429\/djlit.39.2.13622","article-title":"Plagiarism and academic misconduct: A systematic review","volume":"39","author":"Awasthi","year":"2019","journal-title":"Desidoc J. Libr. Inf. Technol."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Tami, M., Ashqar, H.I., and Elhenawy, M. (2024). Automated Question Generation for Science Tests in Arabic Language Using NLP Techniques. arXiv.","DOI":"10.1007\/978-3-031-82377-0_24"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Sammoudi, M., Habaybeh, A., Ashqar, H.I., and Elhenawy, M. (2024). Question-Answering (QA) Model for a Personalized Learning Assistant for Arabic Language. arXiv.","DOI":"10.1007\/978-3-031-82377-0_30"},{"key":"ref_5","unstructured":"OpenAI (2024, February 02). Introducing ChatGPT. Available online: https:\/\/openai.com\/blog\/chatgpt."},{"key":"ref_6","unstructured":"Jayaseelan, N. (2024, February 02). Llama 2, A New Intelligent Open Source Language Model. Available online: https:\/\/www.e2enetworks.com\/blog\/llama-2-the-new-open-source-language-model."},{"key":"ref_7","unstructured":"Team, S. (2024, February 02). Google Bard: Uses, Limitations, and Tips for More Helpful Answers. Available online: https:\/\/www.semrush.com\/blog\/google-bard\/?."},{"key":"ref_8","unstructured":"Anthropic (2024, February 02). Introducing Claude. 
Available online: https:\/\/www.anthropic.com\/index\/introducing-claude."},{"key":"ref_9","unstructured":"Perplexity (2024, February 02). Introducing PPLX Online LLMs. Available online: https:\/\/blog.perplexity.ai\/blog\/introducing-pplx-online-llms."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1","DOI":"10.4018\/IJWSR.338222","article-title":"Predictive analytics in mental health leveraging llm embeddings and machine learning models for social media analysis","volume":"21","author":"Radwan","year":"2024","journal-title":"Int. J. Web Serv. Res. (IJWSR)"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Masri, S., Raddad, Y., Khandaqji, F., Ashqar, H.I., and Elhenawy, M. (2024). Transformer Models in Education: Summarizing Science Textbooks with AraBART, MT5, AraT5, and mBART. arXiv.","DOI":"10.1007\/978-3-031-82377-0_25"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"2422","DOI":"10.3390\/smartcities7050095","article-title":"Multitask Learning for Crash Analysis: A Fine-Tuned LLM Framework Using Twitter Data","volume":"7","author":"Jaradat","year":"2024","journal-title":"Smart Cities"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"598","DOI":"10.1016\/j.jpurol.2023.05.018","article-title":"ChatGPT and large language model (LLM) chatbots: The current state of acceptability and a proposal for guidelines on utilization in academic medicine","volume":"19","author":"Kim","year":"2023","journal-title":"J. Pediatr. Urol."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Nam, D., Macvean, A., Hellendoorn, V., Vasilescu, B., and Myers, B. (2024, January 14\u201320). Using an llm to help with code understanding. 
Proceedings of the IEEE\/ACM 46th International Conference on Software Engineering, New York, NY, USA.","DOI":"10.1145\/3597503.3639187"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Hunt, E., Janamsetty, R., Kinares, C., Koh, C., Sanchez, A., Zhan, F., Ozdemir, M., Waseem, S., Yolcu, O., and Dahal, B. (2019, January 10\u201311). Machine learning models for paraphrase identification and its applications on plagiarism detection. Proceedings of the 2019 IEEE International Conference on Big Knowledge (ICBK), Beijing, China.","DOI":"10.1109\/ICBK.2019.00021"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"AlSallal, M., Iqbal, R., Amin, S., James, A., and Palade, V. (September, January 31). An integrated machine learning approach for extrinsic plagiarism detection. Proceedings of the 2016 9th International Conference on Developments in eSystems Engineering (DeSE), Liverpool, UK.","DOI":"10.1109\/DeSE.2016.1"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Anguita, A., Beghelli, A., and Creixell, W. (2011, January 27\u201329). Automatic cross-language plagiarism detection. Proceedings of the 2011 7th International Conference on Natural Language Processing and Knowledge Engineering, Tokushima, Japan.","DOI":"10.1109\/NLPKE.2011.6138189"},{"key":"ref_18","unstructured":"Kikuchi, H., Goto, T., Wakatsuki, M., and Nishino, T. (July, January 30). A source code plagiarism detecting method using alignment with abstract syntax tree elements. Proceedings of the 15th IEEE\/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel\/Distributed Computing (SNPD), Las Vegas, NV, USA."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Suleiman, D., Awajan, A., and Al-Madi, N. (2017, January 11\u201313). Deep learning based technique for plagiarism detection in Arabic texts. 
Proceedings of the 2017 International Conference on New Trends in Computing Sciences (ICTCS), Amman, Jordan.","DOI":"10.1109\/ICTCS.2017.42"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"50","DOI":"10.1145\/3624725","article-title":"The science of detecting llm-generated text","volume":"67","author":"Tang","year":"2024","journal-title":"Commun. ACM"},{"key":"ref_21","unstructured":"Wu, J., Yang, S., Zhan, R., Yuan, Y., Wong, D.F., and Chao, L.S. (2023). A survey on llm-generated text detection: Necessity, methods, and future directions. arXiv."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Hayawi, K., Shahriar, S., and Mathew, S.S. (2023). The imitation game: Detecting human and ai-generated texts in the era of large language models. arXiv.","DOI":"10.1177\/01655515241227531"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Orenstrakh, M.S., Karnalim, O., Suarez, C.A., and Liut, M. (2023). Detecting llm-generated text in computing education: A comparative study for chatgpt cases. arXiv.","DOI":"10.1109\/COMPSAC61105.2024.00027"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Chen, L., Ding, X., Emani, M., Vanderbruggen, T., Lin, P.H., and Liao, C. (2023, January 12\u201317). Data race detection using large language models. Proceedings of the SC\u201923 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, Denver, CO, USA.","DOI":"10.1145\/3624062.3624088"},{"key":"ref_25","unstructured":"Lab, T.L.A. (2024, February 02). LLM\u2014Detect AI Generated Text | Kaggle. Available online: https:\/\/www.kaggle.com\/competitions\/llm-detect-ai-generated-text\/data."},{"key":"ref_26","first-page":"4864","article-title":"A survey on text pre-processing & feature extraction techniques in natural language processing","volume":"7","author":"Tabassum","year":"2020","journal-title":"Int. Res. J. Eng. Technol. 
(IRJET)"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"27","DOI":"10.56471\/slujst.v4i.266","article-title":"Sentiment classification: Review of text vectorization methods: Bag of words, Tf-Idf, Word2vec and Doc2vec","volume":"4","author":"Abubakar","year":"2022","journal-title":"Slu J. Sci. Technol."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"587","DOI":"10.1162\/tacl_a_00041","article-title":"Data statements for NLP: Toward Mitigating System Bias and Enabling Better Science","volume":"6","author":"Bender","year":"2018","journal-title":"Trans. Assoc. Comput. Linguist."},{"key":"ref_29","unstructured":"GPTZero (2024, February 02). GPTZero|The Trusted AI Detector for ChatGPT, GPT-4, and More. Available online: https:\/\/gptzero.me\/."},{"key":"ref_30","unstructured":"Svrluga, S. (2024, February 02). Princeton Student Creates GPTZero Tool to Detect ChatGPT-Generated Text. Available online: https:\/\/www.washingtonpost.com\/education\/2023\/01\/12\/gptzero-chatgpt-detector-ai\/."},{"key":"ref_31","unstructured":"Wikipedia (2024, February 02). GPTZero. Available online: https:\/\/en.wikipedia.org\/wiki\/GPTZero."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"1516083870","DOI":"10.3346\/jkms.2023.38.e319","article-title":"GPTZero performance in identifying artificial intelligence-generated medical texts: A preliminary study","volume":"38","author":"Habibzadeh","year":"2023","journal-title":"J. Korean Med. Sci."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Heumann, M., Kraschewski, T., and Breitner, M.H. (2025, July 06). ChatGPT and GPTZero in Research and Social Media: A Sentiment-and Topic-Based Analysis. 
Available online: https:\/\/aisel.aisnet.org\/amcis2023\/sig_hci\/sig_hci\/6.","DOI":"10.2139\/ssrn.4467646"}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/16\/9\/767\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T18:39:31Z","timestamp":1760035171000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/16\/9\/767"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,9,4]]},"references-count":33,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2025,9]]}},"alternative-id":["info16090767"],"URL":"https:\/\/doi.org\/10.3390\/info16090767","relation":{},"ISSN":["2078-2489"],"issn-type":[{"value":"2078-2489","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,9,4]]}}}