{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,4]],"date-time":"2026-05-04T00:28:36Z","timestamp":1777854516366,"version":"3.51.4"},"reference-count":64,"publisher":"SAGE Publications","issue":"2","license":[{"start":{"date-parts":[[2015,6,29]],"date-time":"2015-06-29T00:00:00Z","timestamp":1435536000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Journal of Information Science"],"published-print":{"date-parts":[[2016,4]]},"abstract":"<jats:p>Web page classification is an important research direction on web mining. The abundant amount of data available on the web makes it essential to develop efficient and robust models for web mining tasks. Web page classification is the process of assigning a web page to a particular predefined category based on labelled data. It serves for several other web mining tasks, such as focused web crawling, web link analysis and contextual advertising. Machine learning and data mining methods have been successfully applied for several web mining tasks, including web page classification. Multiple classifier systems are a promising research direction in machine learning, which aims to combine several classifiers by differentiating base classifiers and\/or dataset distributions so that more robust classification models can be built. This paper presents a comparative analysis of four different feature selections (correlation, consistency, information gain and chi-square-based feature selection) and four different ensemble learning methods (Boosting, Bagging, Dagging and Random Subspace) based on four different base learners (naive Bayes, K-nearest neighbour algorithm, C4.5 algorithm and FURIA algorithm). The article examines the predictive performance of ensemble methods for web page classification. The experimental results indicate that feature selection and ensemble learning can enhance the predictive performance of classifiers in web page classification. For the DMOZ-50 dataset, the highest average predictive performance (88.1%) is obtained with the combination of consistency-based feature selection with AdaBoost and naive Bayes algorithms, which is a promising result for web page classification. Experimental results indicate that Bagging and Random Subspace ensemble methods and correlation-based and consistency-based feature selection methods obtain better results in terms of accuracy rates.<\/jats:p>","DOI":"10.1177\/0165551515591724","type":"journal-article","created":{"date-parts":[[2015,6,29]],"date-time":"2015-06-29T21:45:13Z","timestamp":1435614313000},"page":"150-165","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":109,"title":["Classifier and feature set ensembles for web page classification"],"prefix":"10.1177","volume":"42","author":[{"given":"Aytu\u011f","family":"Onan","sequence":"first","affiliation":[{"name":"Celal Bayar University, Turkey"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","published-online":{"date-parts":[[2015,6,29]]},"reference":[{"key":"bibr1-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1142\/S0219622008003150"},{"key":"bibr2-0165551515591724","volume-title":"Data mining: Concepts and techniques","author":"Han J","year":"2011"},{"key":"bibr3-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1145\/360402.360406"},{"issue":"2","key":"bibr4-0165551515591724","volume":"5","author":"Bhatia MPS","year":"2008","journal-title":"Webology"},{"key":"bibr5-0165551515591724","doi-asserted-by":"publisher","DOI":"10.4018\/978-1-60960-818-7.ch105"},{"key":"bibr6-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1145\/1459352.1459357"},{"key":"bibr7-0165551515591724","first-page":"211","volume-title":"Genres on the web","author":"Lindemann C","year":"2011"},{"key":"bibr8-0165551515591724","first-page":"395","volume-title":"IFSA world congress","author":"Lee HM"},{"key":"bibr9-0165551515591724","first-page":"96","volume-title":"Proceedings of the 4th international workshop on web information and data management","author":"Sun A"},{"key":"bibr10-0165551515591724","first-page":"487","volume-title":"Proceedings of COMPSAC 2002","author":"Haruechaiyasak C"},{"key":"bibr11-0165551515591724","first-page":"560","volume-title":"Proceedings of 15th IEEE international conference on tools with artificial intelligence","author":"Wang Y"},{"key":"bibr12-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1016\/S0306-4573(02)00022-5"},{"key":"bibr13-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-44831-4_12"},{"key":"bibr14-0165551515591724","first-page":"241","volume-title":"Proceedings of the 2004 IEEE international conference on information reuse and integration","author":"Qi D"},{"key":"bibr15-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2003.03.003"},{"key":"bibr16-0165551515591724","first-page":"1","volume-title":"Transactions on rough sets II","author":"An A","year":"2005"},{"key":"bibr17-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-30217-9_110"},{"key":"bibr18-0165551515591724","first-page":"916","volume-title":"PDCAT 2005","author":"Yi G"},{"key":"bibr19-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2005.09.079"},{"key":"bibr20-0165551515591724","first-page":"33","volume-title":"International conference on computational intelligence and multimedia applications","author":"Devi MI"},{"key":"bibr21-0165551515591724","first-page":"84","volume-title":"Proceedings of recent advances in Slavonic natural language processing","author":"Materna J"},{"key":"bibr22-0165551515591724","first-page":"193","volume-title":"Proceedings of international conference on computational intelligence and security","author":"Zhang J"},{"key":"bibr23-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2007.09.008"},{"key":"bibr24-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2010.08.126"},{"issue":"1","key":"bibr25-0165551515591724","doi-asserted-by":"crossref","first-page":"171","DOI":"10.3233\/FUN-2007-761-211","volume":"76","author":"Saha S","year":"2007","journal-title":"Fundamenta Informaticae"},{"issue":"1","key":"bibr26-0165551515591724","first-page":"1625","volume":"76","author":"Zhong S","year":"2011","journal-title":"Journal of Networks"},{"issue":"4","key":"bibr27-0165551515591724","first-page":"5614","volume":"5","author":"Choudhary R","year":"2014","journal-title":"International Journal of Computer Science and Information Technologies"},{"key":"bibr28-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btm344"},{"key":"bibr29-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1201\/9781584888796.ch19"},{"key":"bibr30-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2010.11.023"},{"key":"bibr31-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1016\/j.dss.2013.08.002"},{"key":"bibr32-0165551515591724","first-page":"1296","volume":"20","author":"G\u00fcnal S","year":"2012","journal-title":"Turkish Journal of Electrical Engineering and Computer Sciences"},{"key":"bibr33-0165551515591724","doi-asserted-by":"crossref","unstructured":"Sara\u00e7 E, \u00d6zel SA. An ant colony optimization based feature selection for web page classification. The Scientific World Journal 2014; Article ID 649260.","DOI":"10.1155\/2014\/649260"},{"key":"bibr34-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1016\/j.compeleceng.2013.11.024"},{"key":"bibr35-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1109\/MCAS.2006.1688199"},{"key":"bibr36-0165551515591724","doi-asserted-by":"publisher","DOI":"10.3233\/IDA-2010-0421"},{"key":"bibr37-0165551515591724","unstructured":"Hall MA. Correlation-based feature selection for machine learning. PhD Thesis, University of Waikato, New Zealand, 1999."},{"key":"bibr38-0165551515591724","first-page":"319","volume-title":"Proceedings of the thirteenth international conference on machine learning","author":"Liu H"},{"key":"bibr39-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2003.1245283"},{"key":"bibr40-0165551515591724","first-page":"412","volume-title":"Proceedings of the fourteenth international conference on machine learning","author":"Yang Y"},{"key":"bibr41-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1145\/505282.505283"},{"key":"bibr42-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1007\/978-0-387-34747-9_18"},{"key":"bibr43-0165551515591724","first-page":"1","volume-title":"Proceedings of the 8th international conference on signal processing","author":"Zhan W"},{"key":"bibr44-0165551515591724","unstructured":"Colas FPR. Data mining scenarios for the discovery of subtypes and the comparison of algorithms. PhD Thesis, Leiden University, Netherlands, 2009."},{"key":"bibr45-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1007\/BF00153759"},{"key":"bibr46-0165551515591724","volume-title":"Data mining for business intelligence: Concepts, techniques and applications","author":"Shmueli G","year":"2010"},{"key":"bibr47-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1016\/B978-0-08-050058-4.50007-3"},{"key":"bibr48-0165551515591724","first-page":"3","volume-title":"The handbook of data mining","author":"Gehrke J","year":"2003"},{"key":"bibr49-0165551515591724","first-page":"66","volume":"4","author":"Quinlan JR","year":"1996","journal-title":"Journal of Artificial Intelligence"},{"key":"bibr50-0165551515591724","first-page":"105","volume-title":"Proceedings of the third IEEE international conference on computer science and information technology","author":"Niuniu X"},{"key":"bibr51-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1007\/s10618-009-0131-8"},{"key":"bibr52-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-45014-9_1"},{"key":"bibr53-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1007\/s10462-009-9124-7"},{"key":"bibr54-0165551515591724","first-page":"325","volume-title":"Proceedings of the thirteenth international conference (ICML\u201996)","author":"Freund Y"},{"key":"bibr55-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1002\/9781118914564"},{"key":"bibr56-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1007\/BF00058655"},{"key":"bibr57-0165551515591724","first-page":"367","volume-title":"Proceedings of the 14th international conference on machine learning","author":"Ting KM"},{"key":"bibr58-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1109\/34.709601"},{"key":"bibr59-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-33015-5_2"},{"key":"bibr60-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-74825-0_11"},{"key":"bibr61-0165551515591724","unstructured":"DMOZ Open Directory Project Dataset, http:\/\/www.unicauca.edu.co\/~ccobos\/wdc\/wdc.htm"},{"key":"bibr62-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0094137"},{"key":"bibr63-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2014.05.047"},{"key":"bibr64-0165551515591724","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-18473-9_19"}],"container-title":["Journal of Information Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0165551515591724","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/0165551515591724","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0165551515591724","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T23:09:13Z","timestamp":1777504153000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/0165551515591724"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,6,29]]},"references-count":64,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2016,4]]}},"alternative-id":["10.1177\/0165551515591724"],"URL":"https:\/\/doi.org\/10.1177\/0165551515591724","relation":{},"ISSN":["0165-5515","1741-6485"],"issn-type":[{"value":"0165-5515","type":"print"},{"value":"1741-6485","type":"electronic"}],"subject":[],"published":{"date-parts":[[2015,6,29]]}}}