{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,28]],"date-time":"2026-03-28T09:20:13Z","timestamp":1774689613295,"version":"3.50.1"},"reference-count":59,"publisher":"IGI Global","issue":"3","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2011,7,1]]},"abstract":"<p>The information world is rich in documents in different formats or applications, such as databases, digital libraries, and the Web. Text classification aids the search functionality offered by search engines and information retrieval systems in dealing with the large number of documents on the Web. Many research studies in the field of text classification have been applied to English, Dutch, Chinese, and other languages, whereas fewer have been applied to Arabic. This paper addresses the automatic classification of Arabic text documents. It applies text classification to Arabic text documents using stemming as part of the preprocessing steps. Results showed that, when text classification was applied without stemming, the support vector machine (SVM) classifier achieved the highest classification accuracy using the two test modes, with 87.79% and 88.54%. 
On the other hand, stemming negatively affected accuracy: the SVM accuracy using the two test modes dropped to 84.49% and 86.35%.<\/p>","DOI":"10.4018\/ijirr.2011070104","type":"journal-article","created":{"date-parts":[[2011,10,19]],"date-time":"2011-10-19T16:28:24Z","timestamp":1319041704000},"page":"54-70","source":"Crossref","is-referenced-by-count":19,"title":["The Effect of Stemming on Arabic Text Classification"],"prefix":"10.4018","volume":"1","author":[{"given":"Abdullah","family":"Wahbeh","sequence":"first","affiliation":[{"name":"Dakota State University, USA"}]},{"given":"Mohammed","family":"Al-Kabi","sequence":"additional","affiliation":[{"name":"Yarmouk University, Jordan"}]},{"given":"Qasem","family":"Al-Radaideh","sequence":"additional","affiliation":[{"name":"Yarmouk University, Jordan"}]},{"given":"Emad","family":"Al-Shawakfa","sequence":"additional","affiliation":[{"name":"Yarmouk University, Jordan"}]},{"given":"Izzat","family":"Alsmadi","sequence":"additional","affiliation":[{"name":"Yarmouk University, Jordan"}]}],"member":"2432","reference":[{"key":"ijirr.2011070104-0","doi-asserted-by":"crossref","unstructured":"Abbasi, A., & Chen, H. (2005). Applying authorship analysis to Arabic Web content. In P. Kantor, G. Muresan, F. Roberts, D. D. Zeng, F.-Y. Wang, H. Chen, & R. C. Merkle (Eds.), Proceedings of the IEEE International Conference on Intelligence and Security Informatics, Atlanta, GA (LNCS 3495, pp. 75-93).","DOI":"10.1007\/11427995_15"},{"key":"ijirr.2011070104-1","unstructured":"Al-Harbi, S., Almuhareb, A., Al-Thubaity, A., Khorsheed, S., & Al-Rajeh, A. (2008). Automatic Arabic text classification. In Proceedings of the 9th International Conference on Statistical Analysis of Textual Data (pp. 77-83)."},{"key":"ijirr.2011070104-2","unstructured":"Al-Kabi, M., & Al-Mustafa, R. (2006). Arabic root based stemmer. 
In Proceedings of the International Arab Conference on Information Technology, Jordan."},{"key":"ijirr.2011070104-3","doi-asserted-by":"publisher","DOI":"10.1177\/0165551510392305"},{"key":"ijirr.2011070104-4","unstructured":"Al-Radaideh, Q. (2008). The impact of classification evaluation methods on rough sets based classifiers. In Proceedings of the International Arab Conference on Information Technology, Sfax, Tunisia."},{"key":"ijirr.2011070104-5","doi-asserted-by":"crossref","unstructured":"Al-Shammari, E. (2008). Towards an error free stemming. In Proceedings of the International Conference on Data Mining, Amsterdam, The Netherlands.","DOI":"10.1145\/1460027.1460030"},{"key":"ijirr.2011070104-6","unstructured":"Al-Shammari, E. T., & Lin, J. (2008). A new Arabic stemming algorithm. In Proceedings of the 2nd ISCA Workshop on Experimental Linguistics, Athens, Greece."},{"key":"ijirr.2011070104-7","doi-asserted-by":"crossref","unstructured":"Al-Shargabi, B., Al-Romimah, W., & Olayah, F. (2011). A comparative study for Arabic text classification algorithms based on stop words elimination. In Proceedings of the International Conference on Intelligent Semantic Web-Services and Applications, Amman, Jordan.","DOI":"10.1145\/1980822.1980833"},{"key":"ijirr.2011070104-8","doi-asserted-by":"publisher","DOI":"10.1002\/asi.21301"},{"key":"ijirr.2011070104-9","doi-asserted-by":"crossref","unstructured":"Cufoglu, A., Lohi, M., & Madani, K. (2008). A comparative study of selected classifiers with classification accuracy in user profiling. In Proceedings of the 7th International Conference on Machine Learning and Application, San Diego, CA (pp. 787-791).","DOI":"10.1109\/ICMLA.2008.139"},{"key":"ijirr.2011070104-10","unstructured":"Diab, M. (2009). Second generation tools (amira 2.0): Fast and robust tokenization, pos tagging, and base phrase chunking. In Proceedings of the Second International Conference on Arabic Language Resources and Tools (pp. 
285-288)."},{"key":"ijirr.2011070104-11","doi-asserted-by":"crossref","unstructured":"Dumais, S., Platt, J., Heckerman, D., & Sahami, M. (1998). Inductive learning algorithms and representations for text categorization. In Proceedings of the Seventh International Conference on Information and Knowledge Management, Bethesda, MD.","DOI":"10.1145\/288627.288651"},{"issue":"2","key":"ijirr.2011070104-12","first-page":"125","article-title":"Arabic text categorization.","volume":"4","author":"R.Duwairi","year":"2007","journal-title":"International Arab Journal of Information Technology"},{"key":"ijirr.2011070104-13","doi-asserted-by":"crossref","unstructured":"Duwairi, R., Al-Refai, M., & Khasawneh, N. (2007). Stemming versus light stemming as feature selection techniques for Arabic text categorization. In Proceedings of the 4th International Conference on Innovations in Information Technology, Dubai, UAE.","DOI":"10.1109\/IIT.2007.4430403"},{"key":"ijirr.2011070104-14","unstructured":"El-Halees, A. (2006). Mining Arabic association rules for text classification. In Proceedings of the First International Conference on Mathematical Sciences, Gaza, Palestine."},{"issue":"1","key":"ijirr.2011070104-15","first-page":"157","article-title":"Arabic text classification using maximum entropy.","volume":"15","author":"A. M.El-Halees","year":"2010","journal-title":"The Islamic University Journal"},{"key":"ijirr.2011070104-16","doi-asserted-by":"crossref","unstructured":"El-Kourdi, M., Bensaid, A., & Rachidi, T. (2004). Automatic Arabic document categorization based on the Na\u00efve Bayes Algorithm. In Proceedings of the 20th Workshop on Computational Approaches to Arabic Script-based Languages, Geneva, Switzerland.","DOI":"10.3115\/1621804.1621819"},{"key":"ijirr.2011070104-17","unstructured":"Fang, Y., Parthasarathy, S., & Schwartz, F. (2001). Using clustering to boost text classification. In Proceedings of the IEEE International Conference on Data Mining (pp. 
123-127)."},{"key":"ijirr.2011070104-18","unstructured":"Gharib, T. F., Habib, M. B., & Fayed, Z. T. (2009). Arabic text classification using support vector machines. International Journal of Computers and their Applications, 16(4), 192-199."},{"key":"ijirr.2011070104-19","unstructured":"Ghawanmeh, S., Al-Shalabi, R., Kanaan, G., Khanfar, K., & Rabab\u2019ah, S. (2005). An algorithm for extracting the root for the Arabic Language. In Proceedings of the 5th International Business Information Management Association Conference, Cairo, Egypt."},{"key":"ijirr.2011070104-20","doi-asserted-by":"crossref","unstructured":"Ghawanmeh, S., Al-Shalabi, R., Kanaan, G., Khanfar, K., & Rabab\u2019ah, S. (2009). Enhanced algorithm for extracting the root of Arabic words. In Proceedings of the Sixth International Conference on Computer Graphics, Imaging and Visualization, China (pp. 388-391).","DOI":"10.1109\/CGIV.2009.10"},{"key":"ijirr.2011070104-21","unstructured":"He, J., Tan, A. H., & Tan, C. L. (2000). A comparative study on Chinese text categorization methods. In Proceedings of the PRICAI International Workshop on Text and Web Mining."},{"key":"ijirr.2011070104-22","unstructured":"Kadri, Y., & Nie, J. Y. (2006). Effective stemming for Arabic information retrieval. The challenge of Arabic for NLP\/MT. In Proceedings of the International Conference at the British Computer Society (pp. 68-74)."},{"key":"ijirr.2011070104-23","doi-asserted-by":"publisher","DOI":"10.1002\/asi.20832"},{"key":"ijirr.2011070104-24","unstructured":"Khoja, S. (2011). Personal web page. Retrieved May 15, 2011, from http:\/\/zeus.cs.pacificu.edu\/shereen\/research.htm"},{"key":"ijirr.2011070104-25","unstructured":"Khoja, S., & Garside, R. (1999). Stemming Arabic text. Lancaster, UK: Lancaster University. Retrieved September 22, 1999, from http:\/\/www.comp.lancs.ac.uk\/computing\/users\/khoja\/stemmer.ps"},{"key":"ijirr.2011070104-26","unstructured":"Khreisat, L. (2006). 
Arabic text classification using N-gram frequency statistics a comparative study. In Proceedings of the International Conference on Data Mining, Las Vegas, NV."},{"key":"ijirr.2011070104-27","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2006.180"},{"key":"ijirr.2011070104-28","doi-asserted-by":"crossref","unstructured":"Largeron, C., Moulin, C., & Gery, M. (2011). Entropy based feature selection for text categorization. In Proceedings of the ACM Symposium on Applied Computing, TaiChung, Taiwan.","DOI":"10.1145\/1982185.1982389"},{"key":"ijirr.2011070104-29","unstructured":"Larkey, L. S., & Connell, M. E. (2001). Arabic information retrieval at UMass in TREC-10. In Proceedings of the Text Retrieval Conference, Gaithersburg, MA."},{"key":"ijirr.2011070104-30","doi-asserted-by":"crossref","unstructured":"Last, M., Markov, A., & Kandel, A. (2008). Multi-lingual detection of Web terrorist content. In H. Chen, F.-Y. Wang, C. C. Yang, D. Zeng, M. Chau, & K. Chang (Ed.), Proceedings of the International Conference on Intelligence and Security Informatics (LNCS 3917, pp. 16-30).","DOI":"10.1007\/11734628_3"},{"issue":"4","key":"ijirr.2011070104-31","article-title":"Performance evaluation of decision tree classifiers on medical datasets.","volume":"26","author":"D.Lavanya","year":"2011","journal-title":"International Journal of Computers and Applications"},{"key":"ijirr.2011070104-32","doi-asserted-by":"crossref","unstructured":"Lee, G., & Lee, G. G. (2004). MMR-based feature selection for text categorization. In Proceedings of the HLT-NAACL: Short Papers, Boston, MA.","DOI":"10.3115\/1613984.1613986"},{"key":"ijirr.2011070104-33","unstructured":"Mesleh, A. (2007, December 29-31). Support vector machines based Arabic language text classification system: feature selection comparative study. In Proceedings of the 12th WSEAS International Conference on Applied Mathematics, Cairo, Egypt."},{"key":"ijirr.2011070104-34","doi-asserted-by":"crossref","unstructured":"Mesleh, A. 
M., & Kanaan, G. (2008). Support vector machine text classification system: Using ant colony optimization based feature subset selection. In Proceedings of the International Conference on Computer Engineering & Systems (pp. 143-148).","DOI":"10.1109\/ICCES.2008.4772984"},{"key":"ijirr.2011070104-35","doi-asserted-by":"publisher","DOI":"10.1016\/j.asoc.2006.04.002"},{"key":"ijirr.2011070104-36","unstructured":"Mustafa, S., & Al-Radaideh, Q. (2001, November 13-15). Arabic word stemming using letter successor and predecessor variety. In Proceedings of the International Arab Conference on Information Technology, Jordan (pp. 216-222)."},{"key":"ijirr.2011070104-37","doi-asserted-by":"publisher","DOI":"10.1007\/s10791-009-9113-0"},{"key":"ijirr.2011070104-38","doi-asserted-by":"crossref","unstructured":"Nwesri, A. F. A., Tahaghoghi, S. M. M., & Scholer, F. (2006, July 22-23). Capturing out-of-vocabulary words in Arabic text. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, Sydney, Australia (pp. 258-266).","DOI":"10.3115\/1610075.1610113"},{"issue":"9","key":"ijirr.2011070104-39","article-title":"Stemming algorithm to classify Arabic documents.","volume":"7","author":"M. A. H.Omer","year":"2010","journal-title":"Journal of Communication and Computer"},{"key":"ijirr.2011070104-40","doi-asserted-by":"crossref","unstructured":"Peng, J., Yang, D., Tang, S., Gao, J., Zhang, P., & Fu, Y. (2007). A concept similarity based text classification algorithm. In Proceedings of the Fourth International Conference on Fuzzy Systems and Knowledge Discovery (pp. 
535-539).","DOI":"10.1109\/FSKD.2007.11"},{"issue":"4","key":"ijirr.2011070104-41","first-page":"430","article-title":"The influence of preprocessing parameters on text categorization.","volume":"1","author":"J.Pomik\u00e1lek","year":"2007","journal-title":"International Journal of Applied Science Engineering and Technology"},{"key":"ijirr.2011070104-42","doi-asserted-by":"crossref","unstructured":"Qiu, L. Q., Zhao, R. Y., Zhou, G., & Yi, S. W. (2008). An extensive empirical study of feature selection for text categorization. In Proceedings of the 7th IEEE\/ACIS International Conference on computer and Information Science.","DOI":"10.1109\/ICIS.2008.49"},{"key":"ijirr.2011070104-43","unstructured":"Rahman, C. N., Sohel, F. A., Naushad, P., & Kamruzzaman, S. M. (2003). Text classification using the concept of association rule of data mining. In Proceedings of the International Conference on Information Technology, Kathmandu, Nepal (pp. 234-241)."},{"key":"ijirr.2011070104-44","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511486975","author":"K. C.Ryding","year":"2005","journal-title":"A reference grammar of modern standard Arabic"},{"key":"ijirr.2011070104-45","unstructured":"Said, D. A., Wanas, N. M., & Darwish, N. M. (2009). A study of text preprocessing tools for Arabic text categorization. In Proceedings of the Second International Conference on Arabic Language (pp. 230-236)."},{"key":"ijirr.2011070104-46","unstructured":"Sawaf, H., Zaplo, J., & Ney, H. (2001). Statistical classification methods for Arabic news articles. In Proceedings of the Arabic Natural Language Processing, Toulouse, France."},{"key":"ijirr.2011070104-47","unstructured":"Sembok, T. M. T., Abu Ata, B. M., & Abu Bakar, Z. (2011). A rule-based Arabic stemming algorithm. 
In Proceedings of the 5th European Conference on European Computing Conference."},{"key":"ijirr.2011070104-48","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2006.04.001"},{"key":"ijirr.2011070104-49","doi-asserted-by":"crossref","unstructured":"Taghva, K., Elkhoury, R., & Coombs, J. (2005). Arabic stemming without a root dictionary. Paper presented at the International Conference on Information Technology: Coding and Computing.","DOI":"10.1109\/ITCC.2005.90"},{"key":"ijirr.2011070104-50","doi-asserted-by":"crossref","unstructured":"Thabet, N. (2004). Stemming the Qur'an. In Proceedings of the Workshop on Computational Approaches to Arabic Script-based Languages, Geneva, Switzerland.","DOI":"10.3115\/1621804.1621827"},{"key":"ijirr.2011070104-51","author":"H.Witten","year":"2005","journal-title":"Data mining: Practical machine learning tools and techniques"},{"key":"ijirr.2011070104-52","doi-asserted-by":"crossref","unstructured":"Wongpun, S., & Srivihok, A. (2008). Comparison of attribute selection techniques and algorithms in classifying bad behaviors of vocational education students. In Proceedings of the 2nd IEEE International Conference on Digital Ecosystems and Technologies, Australia (pp. 526-531).","DOI":"10.1109\/DEST.2008.4635213"},{"key":"ijirr.2011070104-53","doi-asserted-by":"publisher","DOI":"10.1007\/s10115-007-0114-2"},{"key":"ijirr.2011070104-54","doi-asserted-by":"crossref","unstructured":"Wu, Y., Chang, C., & Lee, Y. (2006). A general and multi-lingual phrase chunking model based on masking method. In Proceedings of the 7th International Conference on Intelligent Text Processing and Computational Linguistics, Mexico City, Mexico (pp. 144-155).","DOI":"10.1007\/11671299_17"},{"key":"ijirr.2011070104-55","doi-asserted-by":"publisher","DOI":"10.1108\/02635570910957669"},{"key":"ijirr.2011070104-56","unstructured":"Zhou, S., Ling, T. W., Guan, J., Hu, J., & Zhou, A. (2003). Fast text classification: A training-corpus pruning based approach. 
In Proceedings of the Eighth International Conference on Database Systems for Advanced Applications."},{"key":"ijirr.2011070104-57","doi-asserted-by":"crossref","unstructured":"Zhuang, D., Zhang, B., Yang, Q., Yan, J., Chen, Z., & Chen, Y. (2005). Efficient text classification by weighted proximal SVM. In Proceedings of the Fifth IEEE International Conference on Data Mining (pp. 538-545).","DOI":"10.1109\/ICDM.2005.56"},{"key":"ijirr.2011070104-58","unstructured":"Zubi, Z. S. (2009). Using some web content mining techniques for Arabic text classification. In Proceedings of the 8th WSEAS International Conference on Data Networks, Communications, Computers, Baltimore, MD (pp. 73-84)."}],"container-title":["International Journal of Information Retrieval Research"],"original-title":[],"language":"ng","link":[{"URL":"https:\/\/www.igi-global.com\/viewtitle.aspx?TitleId=64171","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,6,1]],"date-time":"2022-06-01T21:14:02Z","timestamp":1654118042000},"score":1,"resource":{"primary":{"URL":"https:\/\/services.igi-global.com\/resolvedoi\/resolve.aspx?doi=10.4018\/ijirr.2011070104"}},"subtitle":["An Empirical Study"],"short-title":[],"issued":{"date-parts":[[2011,7,1]]},"references-count":59,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2011,7]]}},"URL":"https:\/\/doi.org\/10.4018\/ijirr.2011070104","relation":{},"ISSN":["2155-6377","2155-6385"],"issn-type":[{"value":"2155-6377","type":"print"},{"value":"2155-6385","type":"electronic"}],"subject":[],"published":{"date-parts":[[2011,7,1]]}}}