{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,29]],"date-time":"2026-05-29T11:15:27Z","timestamp":1780053327703,"version":"3.54.0"},"reference-count":54,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2023,1,9]],"date-time":"2023-01-09T00:00:00Z","timestamp":1673222400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Computers"],"abstract":"<jats:p>Much news is available online, and not all is categorized. A few researchers have carried out work on news classification in the past, and most of the work focused on fake news identification. Most of the work performed on news categorization is carried out on a benchmark dataset. The problem with the benchmark dataset is that model trained with it is not applicable in the real world as the data are pre-organized. This study used machine learning (ML) techniques to categorize online news articles as these techniques are cheaper in terms of computational needs and are less complex. This study proposed the hyperparameter-optimized support vector machines (SVM) to categorize news articles according to their respective category. Additionally, five other ML techniques, Stochastic Gradient Descent (SGD), Random Forest (RF), Logistic Regression (LR), K-Nearest Neighbor (KNN), and Na\u00efve Bayes (NB), were optimized for comparison for the news categorization task. The results showed that the optimized SVM model performed better than other models, while without optimization, its performance was worse than other ML models.<\/jats:p>","DOI":"10.3390\/computers12010016","type":"journal-article","created":{"date-parts":[[2023,1,9]],"date-time":"2023-01-09T05:57:14Z","timestamp":1673243834000},"page":"16","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":51,"title":["Topic Classification of Online News Articles Using Optimized Machine Learning Models"],"prefix":"10.3390","volume":"12","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9448-2896","authenticated-orcid":false,"given":"Shahzada","family":"Daud","sequence":"first","affiliation":[{"name":"Department of Computer Science, Khwaja Fareed University of Engineering & Information Technology, Rahim Yar Khan 64200, Pakistan"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Muti","family":"Ullah","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Khwaja Fareed University of Engineering & Information Technology, Rahim Yar Khan 64200, Pakistan"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0101-0329","authenticated-orcid":false,"given":"Amjad","family":"Rehman","sequence":"additional","affiliation":[{"name":"Artificial Intelligence & Data Analytics Lab (AIDA), CCIS Prince Sultan University, Riyadh 11586, Saudi Arabia"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Tanzila","family":"Saba","sequence":"additional","affiliation":[{"name":"Artificial Intelligence & Data Analytics Lab (AIDA), CCIS Prince Sultan University, Riyadh 11586, Saudi Arabia"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9990-1084","authenticated-orcid":false,"given":"Robertas","family":"Dama\u0161evi\u010dius","sequence":"additional","affiliation":[{"name":"Department of Applied Informatics, Vytautas Magnus University, 44404 Kaunas, Lithuania"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Abdul","family":"Sattar","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Khwaja Fareed University of Engineering & Information Technology, Rahim Yar Khan 64200, Pakistan"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2023,1,9]]},"reference":[{"key":"ref_1","first-page":"718","article-title":"Determinants of News Content","volume":"13","author":"Karlsson","year":"2012","journal-title":"J. Stud."},{"key":"ref_2","unstructured":"Mitchell, A., and Rosenstiel, T. (2022, January 08). Navigating News Online: Where People Go, How They Get There and What Lures Them Away. PEW Research Center\u2019s Project for Excellence in Journalism. Available online: http:\/\/www.journalism.org\/2011\/05\/09\/navigatingnewsonline\/."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"437","DOI":"10.1179\/1743131X14Y.0000000083","article-title":"Online Persian\/Arabic script classification without contextual information","volume":"62","author":"Harouni","year":"2014","journal-title":"Imaging Sci. J."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Bakshy, E., Rosenn, I., Marlow, C., and Adamic, L. (2012, January 16\u201320). The Role of Social Networks in Information Diffusion. Proceedings of the WWW 2012: 21st World Wide Web Conference, Lyon, France.","DOI":"10.1145\/2187836.2187907"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"707","DOI":"10.1111\/j.1460-2466.2008.00410.x","article-title":"A New Era of Minimal Effects? The Changing Foundations of Political Communication","volume":"58","author":"Bennett","year":"2008","journal-title":"J. Commun."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"261","DOI":"10.1007\/s10462-011-9229-7","article-title":"Off-line cursive script recognition: Current advances, comparisons and remaining problems","volume":"37","author":"Rehman","year":"2012","journal-title":"Artif. Intell. Rev."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"569","DOI":"10.1002\/j.1538-165X.2003.tb00406.x","article-title":"Media, Misperceptions, and the Iraq War","volume":"118","author":"Kull","year":"2003","journal-title":"Polit. Sci. Q."},{"key":"ref_8","first-page":"65","article-title":"Survey of text mining, Pattern Recognit","volume":"18","author":"Chen","year":"2005","journal-title":"Artif. Intell."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Schutze, H., Manning, C.D., and Raghavan, P. (2008). Introduction to Information Retrieval, Cambridge University Press.","DOI":"10.1017\/CBO9780511809071"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s13721-019-0209-1","article-title":"A comparative study of features selection for skin lesion detection from dermoscopic images","volume":"9","author":"Javed","year":"2020","journal-title":"Netw. Model. Anal. Health Inform. Bioinform."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Larabi-Marie-Sainte, S., Aburahmah, L., Almohaini, R., and Saba, T. (2019). Current Techniques for Diabetes Prediction: Review and Case Study. Appl. Sci., 9.","DOI":"10.3390\/app9214604"},{"key":"ref_12","first-page":"2493","article-title":"Natural Language Processing (Almost) from Scratch","volume":"12","author":"Collobert","year":"2011","journal-title":"J. Mach. Learn. Res."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"486","DOI":"10.1016\/j.dsp.2011.01.016","article-title":"Performance analysis of character segmentation approach for cursive script recognition on benchmark database","volume":"21","author":"Rehman","year":"2011","journal-title":"Digit. Signal Process."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Tesfagergish, S.G., Kapo\u010di\u016bt\u0117-Dzikien\u0117, J., and Dama\u0161evi\u010dius, R. (2022). Zero-Shot Emotion Detection for Semi-Supervised Sentiment Analysis Using Sentence Transformers and Ensemble Learning. Appl. Sci., 12.","DOI":"10.3390\/app12178662"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"1337","DOI":"10.1007\/s00521-014-1618-9","article-title":"Annotated comparisons of proposed preprocessing techniques for script recognition","volume":"25","author":"Saba","year":"2014","journal-title":"Neural Comput. Appl."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"429","DOI":"10.5755\/j01.itc.51.3.29907","article-title":"A Comprehensive Study of Learning Approaches for Author Gender Identification","volume":"51","author":"Dalyan","year":"2022","journal-title":"Inf. Technol. Control"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"390","DOI":"10.5755\/j01.itc.51.2.30701","article-title":"A Hotel Recommender System Based on Multi-Criteria Collaborative Filtering","volume":"51","author":"Shambour","year":"2020","journal-title":"Inf. Technol. Control"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"012043","DOI":"10.1088\/1742-6596\/1682\/1\/012043","article-title":"Intelligent recommendation of related items based on naive bayes and collaborative filtering combination model","volume":"1682","author":"Wei","year":"2020","journal-title":"J. Phys. Conf. Ser."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"523","DOI":"10.1007\/978-3-030-86979-3_37","article-title":"Deep fake recognition in tweets using text augmentation, word embeddings and deep learning","volume":"Volume 12954","author":"Tesfagergish","year":"2021","journal-title":"Computational Science and Its Applications, ICCSA 2021"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"345","DOI":"10.5755\/j01.itc.51.2.30796","article-title":"GATSum: Graph-Based Topic-Aware Abstract Text Summarization","volume":"51","author":"Jiang","year":"2022","journal-title":"Inf. Technol. Control"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"482","DOI":"10.5755\/j01.itc.49.4.26808","article-title":"Part-of-Speech Tagging via Deep Neural Networks for Northern-Ethiopic Languages","volume":"49","author":"Tesfagergish","year":"2020","journal-title":"Inf. Technol. Control"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"8839524","DOI":"10.1155\/2020\/8839524","article-title":"Text Messaging-Based Medical Diagnosis Using Natural Language Processing and Fuzzy Logic","volume":"2020","author":"Omoregbe","year":"2020","journal-title":"J. Health Eng."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"846930","DOI":"10.3389\/fdata.2022.846930","article-title":"Topic Modeling for Interpretable Text Classification from EHRs","volume":"5","author":"Rijcken","year":"2022","journal-title":"Front. Big Data"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Chang, I.-C., Horng, J.-S., Liu, C.-H., Chou, S.-F., and Yu, T.-Y. (2022). Exploration of Topic Classification in the Tourism Field with Text Mining Technology\u2014A Case Study of the Academic Journal Papers. Sustainability, 14.","DOI":"10.3390\/su14074053"},{"key":"ref_25","first-page":"521","article-title":"Sentiment analysis of lithuanian texts using deep learning methods","volume":"Volume 920","year":"2018","journal-title":"Information and Software Technologies. ICIST 2018"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Damasevicius, R., Valys, R., and Wozniak, M. (2016, January 6\u20139). Intelligent tagging of online texts using fuzzy logic. Proceedings of the 2016 IEEE Symposium Series on Computational Intelligence, SSCI 2016, Athens, Greece.","DOI":"10.1109\/SSCI.2016.7849917"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Alhaj, Y.A., Dahou, A., Al-Qaness, M.A.A., Abualigah, L., Abbasi, A.A., Almaweri, N.A.O., Elaziz, M.A., and Dama\u0161evi\u010dius, R. (2022). A Novel Text Classification Technique Using Improved Particle Swarm Optimization: A Case Study of Arabic Language. Futur. Internet, 14.","DOI":"10.3390\/fi14070194"},{"key":"ref_28","unstructured":"Zhang, X., and LeCun, Y. (2015). Text Understanding from Scratch. arXiv."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"3285","DOI":"10.1007\/s00521-016-2244-5","article-title":"Fused features mining for depth-based hand gesture recognition to classify blind human communication","volume":"28","author":"Jadooki","year":"2017","journal-title":"Neural Comput. Appl."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"853","DOI":"10.1016\/j.eswa.2013.08.015","article-title":"Syntactic N-grams as machine learning features for natural language processing","volume":"41","author":"Sidorov","year":"2014","journal-title":"Expert Syst. Appl."},{"key":"ref_31","first-page":"29","article-title":"Using tf-idf to determine word relevance in document queries","volume":"242","author":"Ramos","year":"2003","journal-title":"Proc. First Instr. Conf. Mach. Learn."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Wallach, H.M. (2006, January 25\u201329). Topic Modeling: Beyond Bag-of-Words. Proceedings of the ICML \u201906: 23rd International Conference on Machine Learning, Pittsburgh, PA, USA.","DOI":"10.1145\/1143844.1143967"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Lilleberg, J., Zhu, Y., and Zhang, Y. (2015, January 6\u20138). Support vector machines and Word2vec for text classification with semantic features. Proceedings of the 2015 IEEE 14th International Conference on Cognitive Informatics & Cognitive Computing (ICCI*CC), Beijing, China.","DOI":"10.1109\/ICCI-CC.2015.7259377"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Shuai, Q., Huang, Y., Jin, L., and Pang, L. (2018, January 12\u201314). Sentiment Analysis on Chinese Hotel Reviews with Doc2Vec and Classifiers. Proceedings of the 2018 IEEE 3rd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), Chongqing, China.","DOI":"10.1109\/IAEAC.2018.8577581"},{"key":"ref_35","first-page":"895","article-title":"Classification and ranking of trending topics in twitter using tweets text","volume":"7","author":"Umakanth","year":"2020","journal-title":"J. Crit. Rev."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1145\/2347736.2347755","article-title":"A Few Useful Things to Know about Machine Learning","volume":"55","author":"Domingos","year":"2012","journal-title":"Commun. ACM"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"5195508","DOI":"10.1155\/2021\/5195508","article-title":"Vision Sensor-Based Real-Time Fire Detection in Resource-Constrained IoT Environments","volume":"2021","author":"Yar","year":"2021","journal-title":"Comput. Intell. Neurosci."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Dilrukshi, I., and De Zoysa, K. (2013, January 11\u201315). Twitter news classification: Theoretical and practical comparison of SVM against Naive Bayes algorithms. Proceedings of the 2013 International Conference on Advances in ICT for Emerging Regions (ICTer), Colombo, Sri Lanka.","DOI":"10.1109\/ICTer.2013.6761192"},{"key":"ref_39","unstructured":"Bun, K.K., and Ishizuka, M. (2002, January 14). Topic extraction from news archive using TF*PDF algorithm. Proceedings of the Third International Conference on Web Information Systems Engineering, 2002. WISE 2002, Singapore."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Kapusta, J., and Obonya, J. (2020). Improvement of Misleading and Fake News Classification for Flective Languages by Morphological Group Analysis. Informatics, 7.","DOI":"10.3390\/informatics7010004"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Li, Y., Wang, X., and Xu, P. (2018). Chinese Text Classification Model Based on Deep Learning. Futur. Internet, 10.","DOI":"10.3390\/fi10110113"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Zhu, Y., Gao, X., Zhang, W., Liu, S., and Zhang, Y. (2018). A Bi-Directional LSTM-CNN Model with Attention for Aspect-Level Text Classification. Futur. Internet, 10.","DOI":"10.3390\/fi10120116"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1007\/978-3-540-45219-5_7","article-title":"Supervised Term Weighting for Automated Text Categorization","volume":"Volume 138","author":"Sirmakessis","year":"2004","journal-title":"Text Mining and its Applications: Studies in Fuzziness and Soft Computing"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"893378","DOI":"10.3389\/fgene.2022.893378","article-title":"TextNetTopics: Text Classification Based Word Grouping as Topics and Topics\u2019 Scoring","volume":"13","author":"Yousef","year":"2022","journal-title":"Front. Genet."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"2143","DOI":"10.3233\/JIFS-211471","article-title":"The short texts classification based on neural network topic model","volume":"42","author":"Shao","year":"2022","journal-title":"J. Intell. Fuzzy Syst."},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"123174","DOI":"10.1016\/j.physa.2019.123174","article-title":"Fake news detection within online social media using supervised artificial intelligence algorithms","volume":"540","author":"Ozbay","year":"2019","journal-title":"Phys. A Stat. Mech. Its Appl."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"2758","DOI":"10.1016\/j.eswa.2010.08.066","article-title":"A comparative study of TF*IDF, LSI and multi-words for text classification","volume":"38","author":"Zhang","year":"2011","journal-title":"Expert Syst. Appl."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"131","DOI":"10.1007\/s007999900025","article-title":"A probabilistic justification for using tf \u00d7 idf term weighting in information retrieval","volume":"3","author":"Hiemstra","year":"2000","journal-title":"Int. J. Digit. Libr."},{"key":"ref_49","unstructured":"Gholamy, A., Kreinovich, V., and Kosheleva, O. (2018). Why 70\/30 or 80\/20 Relation Between Training and Testing Sets: A Pedagogical Explanation, Departmental Technical Reports (C.S.)."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Goutte, C., and Gaussier, E. (2005). A probabilistic interpretation of precision, recall and F-score, with implication for evaluation. Advances in Information Retrieval, Springer. Lecture Notes in Computer Science.","DOI":"10.1007\/978-3-540-31865-1_25"},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"427","DOI":"10.1016\/j.ipm.2009.03.002","article-title":"A systematic analysis of performance measures for classification tasks","volume":"45","author":"Sokolova","year":"2009","journal-title":"Inf. Process. Manag."},{"key":"ref_52","first-page":"171","article-title":"Neural computing for online Arabic handwriting recognition using hard stroke features mining","volume":"17","author":"Rehman","year":"2021","journal-title":"Int. J. Innov. Comput. Inf. Control"},{"key":"ref_53","first-page":"197","article-title":"An Intelligent Fused Approach for Face Recognition","volume":"22","author":"Meethongjan","year":"2013","journal-title":"J. Intell. Syst."},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Maragheh, H.K., Gharehchopogh, F.S., Majidzadeh, K., and Sangar, A.B. (2022). A New Hybrid Based on Long Short-Term Memory Network with Spotted Hyena Optimization Algorithm for Multi-Label Text Classification. Mathematics, 10.","DOI":"10.3390\/math10030488"}],"container-title":["Computers"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-431X\/12\/1\/16\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T18:04:18Z","timestamp":1760119458000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-431X\/12\/1\/16"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,1,9]]},"references-count":54,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2023,1]]}},"alternative-id":["computers12010016"],"URL":"https:\/\/doi.org\/10.3390\/computers12010016","relation":{},"ISSN":["2073-431X"],"issn-type":[{"value":"2073-431X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,1,9]]}}}