{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,9]],"date-time":"2025-11-09T03:46:23Z","timestamp":1762659983832,"version":"build-2065373602"},"reference-count":64,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2020,11,3]],"date-time":"2020-11-03T00:00:00Z","timestamp":1604361600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Informatics"],"abstract":"<jats:p>Active learning is the category of partially supervised algorithms that is differentiated by its strategy to combine both the predictive ability of a base learner and the human knowledge so as to exploit adequately the existence of unlabeled data. Its ambition is to compose powerful learning algorithms which otherwise would be based only on insufficient labelled samples. Since the latter kind of information could raise important monetization costs and time obstacles, the human contribution should be seriously restricted compared with the former. For this reason, we investigate the use of the Logitboost wrapper classifier, a popular variant of ensemble algorithms which adopts the technique of boosting along with a regression base learner based on Model trees into 3 different active learning query strategies. We study its efficiency against 10 separate learners under a well-described active learning framework over 91 datasets which have been split to binary and multi-class problems. We also included one typical Logitboost variant with a separate internal regressor for discriminating the benefits of adopting a more accurate regression tree than one-node trees, while we examined the efficacy of one hyperparameter of the proposed algorithm. Since the application of the boosting technique may provide overall less biased predictions, we assume that the proposed algorithm, named as Logitboost(M5P), could provide both accurate and robust decisions under active learning scenarios that would be beneficial on real-life weakly supervised classification tasks. Its smoother weighting stage over the misclassified cases during training as well as the accurate behavior of M5P are the main factors that lead towards this performance. Proper statistical comparisons over the metric of classification accuracy verify our assumptions, while adoption of M5P instead of weak decision trees was proven to be more competitive for the majority of the examined problems. We present our results through appropriate summarization approaches and explanatory visualizations, commenting our results per case.<\/jats:p>","DOI":"10.3390\/informatics7040050","type":"journal-article","created":{"date-parts":[[2020,11,3]],"date-time":"2020-11-03T09:09:32Z","timestamp":1604394572000},"page":"50","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Investigation of Combining Logitboost(M5P) under Active Learning Classification Tasks"],"prefix":"10.3390","volume":"7","author":[{"given":"Vangjel","family":"Kazllarof","sequence":"first","affiliation":[{"name":"Department of Mathematics, University of Patras, 26500 Patras, Greece"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5307-6186","authenticated-orcid":false,"given":"Stamatis","family":"Karlos","sequence":"additional","affiliation":[{"name":"Department of Mathematics, University of Patras, 26500 Patras, Greece"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2247-3082","authenticated-orcid":false,"given":"Sotiris","family":"Kotsiantis","sequence":"additional","affiliation":[{"name":"Department of Mathematics, University of Patras, 26500 Patras, Greece"}]}],"member":"1968","published-online":{"date-parts":[[2020,11,3]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1950","DOI":"10.14778\/3229863.3236232","article-title":"The return of jedAI: End-to-End Entity Resolution for Structured and Semi-Structured Data","volume":"11","author":"Papadakis","year":"2018","journal-title":"Proc. VLDB Endow."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"32","DOI":"10.3390\/informatics1010032","article-title":"Using Collaborative Tagging for Text Classification: From Text Classification to Opinion Mining","volume":"1","author":"Charton","year":"2013","journal-title":"Informatics"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"105895","DOI":"10.1016\/j.asoc.2019.105895","article-title":"Value-added tax fraud detection with scalable anomaly detection techniques","volume":"86","author":"Vanhoeyveld","year":"2020","journal-title":"Appl. Soft Comput."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Masood, A., and Al-Jumaily, A. (2017, January 24\u201326). Semi advised learning and classification algorithm for partially labeled skin cancer data analysis. Proceedings of the 2017 12th International Conference on Intelligent Systems and Knowledge Engineering (ISKE), Nanjing, China.","DOI":"10.1109\/ISKE.2017.8258767"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Haseeb, M., Hussain, H.I., \u015alusarczyk, B., and Jermsittiparsert, K. (2019). Industry 4.0: A Solution towards Technology Challenges of Sustainable Business Performance. Soc. Sci., 8.","DOI":"10.3390\/socsci8050154"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1016\/j.patrec.2013.10.017","article-title":"Pattern classification and clustering: A review of partially supervised learning approaches","volume":"37","author":"Schwenker","year":"2014","journal-title":"Pattern Recognit. Lett."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12859-015-0844-1","article-title":"Weakly supervised learning of biomedical information extraction from curated data","volume":"17","author":"Jain","year":"2016","journal-title":"BMC Bioinform."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"44","DOI":"10.1093\/nsr\/nwx106","article-title":"A brief introduction to weakly supervised learning","volume":"5","author":"Zhou","year":"2017","journal-title":"Natl. Sci. Rev."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"69","DOI":"10.1007\/s10676-019-09516-z","article-title":"Quarantining online hate speech: Technical and ethical perspectives","volume":"22","author":"Ullmann","year":"2019","journal-title":"Ethic- Inf. Technol."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Settles, B. (2012). Active Learning, Morgan & Claypool Publishers.","DOI":"10.1007\/978-3-031-01560-1"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Karlos, S., Fazakis, N., Kotsiantis, S.B., Sgarbas, K., and Karlos, G. (2017). Self-Trained Stacking Model for Semi-Supervised Learning. Int. J. Artif. Intell. Tools, 26.","DOI":"10.1142\/S0218213017500014"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1109\/MSP.2017.2699358","article-title":"Advanced Data Exploitation in Speech Analysis: An overview","volume":"34","author":"Zhang","year":"2017","journal-title":"IEEE Signal Process. Mag."},{"key":"ref_13","first-page":"24","article-title":"Semi-supervised and Active Learning in Video Scene Classification from Statistical Features","volume":"Volume 2192","author":"Sabata","year":"2018","journal-title":"IAL@PKDD\/ECML"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Karlos, S., Kanas, V.G., Aridas, C., Fazakis, N., and Kotsiantis, S. (2019, January 15\u201317). Combining Active Learning with Self-Train Algorithm for Classification of Multimodal Problems. Proceedings of the 2019 10th International Conference on Information, Intelligence, Systems and Applications (IISA), Patras, Greece.","DOI":"10.1109\/IISA.2019.8900724"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Senthilnath, J., Varia, N., Dokania, A., Anand, G., and Benediktsson, J.A. (2020). Deep TEC: Deep Transfer Learning with Ensemble Classifier for Road Extraction from UAV Imagery. Remote Sens., 12.","DOI":"10.3390\/rs12020245"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Menze, B.H., Kelm, B.M., Splitthoff, D.N., Koethe, U., and Hamprecht, F.A. (2011). On Oblique Random Forests, Springer Science and Business Media LLC.","DOI":"10.1007\/978-3-642-23783-6_29"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1214\/aos\/1016218223","article-title":"Additive logistic regression: A statistical view of boosting","volume":"28","author":"Friedman","year":"2000","journal-title":"Ann. Stat."},{"key":"ref_18","unstructured":"Freund, Y., and Schapire, R.E. (1996). Experiments with a New Boosting Algorithm. ICML, Morgan Kaufmann. Available online: http:\/\/citeseerx.ist.psu.edu\/viewdoc\/summary?doi=10.1.1.133.1040."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"106","DOI":"10.1016\/j.ins.2012.11.015","article-title":"Let us know your decision: Pool-based active training of a generative classifier with the selection strategy 4DS","volume":"230","author":"Reitmaier","year":"2013","journal-title":"Inf. Sci."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"164","DOI":"10.1007\/s10618-016-0460-3","article-title":"Evidence-based uncertainty sampling for active learning","volume":"31","author":"Sharma","year":"2016","journal-title":"Data Min. Knowl. Discov."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Grau, I., Sengupta, D., Lorenzo, M.M.G., and Nowe, A. (2020, June 07). An Interpretable Semi-Supervised Classifier Using Two Different Strategies for Amended Self-Labeling 2020. Available online: http:\/\/arxiv.org\/abs\/2001.09502.","DOI":"10.1109\/FUZZ48607.2020.9177549"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"825","DOI":"10.1007\/s00500-005-0011-0","article-title":"Induction of descriptive fuzzy classifiers with the Logitboost algorithm","volume":"10","author":"Otero","year":"2005","journal-title":"Soft Comput."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Burduk, R., and Bo\u017cejko, W. (2019). Modified Score Function and Linear Weak Classifiers in LogitBoost Algorithm. Advances in Intelligent Systems and Computing, Springer.","DOI":"10.1007\/978-3-030-31254-1_7"},{"key":"ref_24","first-page":"53","article-title":"Logitboost of simple bayesian classifier","volume":"29","author":"Kotsiantis","year":"2005","journal-title":"Informatica"},{"key":"ref_25","unstructured":"Leathart, T., Frank, E., Holmes, G., Pfahringer, B., Noh, Y.-K., and Zhang, M.-L. (2017, January 15\u201317). Probability Calibration Trees. Proceedings of the Ninth Asian Conference on Machine Learning, Seoul, Korea. Available online: http:\/\/proceedings.mlr.press\/v77\/leathart17a\/leathart17a.pdf."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"88","DOI":"10.1016\/j.csda.2017.03.010","article-title":"LogitBoost autoregressive networks","volume":"112","author":"Goessling","year":"2017","journal-title":"Comput. Stat. Data Anal."},{"key":"ref_27","unstructured":"Li, P. (2012). Robust LogitBoost and Adaptive Base Class (ABC) LogitBoost. arXiv."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"295","DOI":"10.1007\/s10994-014-5434-3","article-title":"An improved multiclass LogitBoost using adaptive-one-vs-one","volume":"97","author":"Reid","year":"2014","journal-title":"Mach. Learn."},{"key":"ref_29","first-page":"343","article-title":"Learning with continuous classes","volume":"92","author":"Quinlan","year":"1992","journal-title":"Mach. Learn."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1016\/j.ijsbe.2014.12.002","article-title":"Modeling compressive strength of recycled aggregate concrete by Artificial Neural Network, Model Tree and Non-linear Regression","volume":"3","author":"Deshpande","year":"2014","journal-title":"Int. J. Sustain. Built Environ."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1007\/978-3-319-23983-5_14","article-title":"Self-Train LogitBoost for Semi-supervised Learning\u201d in Engineering Applications of Neural Networks","volume":"Volume 517","author":"Karlos","year":"2015","journal-title":"Communications in Computer and Information Science"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Iba, W., and Langley, P. (1992, January 1\u20133). Induction of One-Level Decision Trees (Decision Stump). Proceedings of the Ninth International Conference on Machine Learning, Aberdeen, Scotland.","DOI":"10.1016\/B978-1-55860-247-2.50035-8"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"10","DOI":"10.1145\/1656274.1656278","article-title":"The WEKA data mining software","volume":"11","author":"Hall","year":"2009","journal-title":"ACM SIGKDD Explor. Newsl."},{"key":"ref_34","unstructured":"Fung, G. (2011). Active Learning from Crowds. ICML, Springer."},{"key":"ref_35","unstructured":"Aggarwal, C.C. (2015). Data Classification: Algorithms and Applications, CRC Press."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"606","DOI":"10.1016\/j.procs.2015.04.142","article-title":"An Active Learning Framework for Human Hand Sign Gestures and Handling Movement Epenthesis Using Enhanced Level Building Approach","volume":"48","author":"Elakkiya","year":"2015","journal-title":"Procedia Comput. Sci."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Pozo, M., Chiky, R., Meziane, F., and M\u00e9tais, E. (2018). Exploiting Past Users\u2019 Interests and Predictions in an Active Learning Method for Dealing with Cold Start in Recommender Systems. Informatics, 5.","DOI":"10.20944\/preprints201803.0253.v1"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Souza, R.R., Dorn, A., Piringer, B., and Wandl-Vogt, E. (2019). Towards A Taxonomy of Uncertainties: Analysing Sources of Spatio-Temporal Uncertainty on the Example of Non-Standard German Corpora. Informatics, 6.","DOI":"10.3390\/informatics6030034"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Nguyen, V.-L., Destercke, S., and H\u00fcllermeier, E. (2019). Epistemic Uncertainty Sampling. Lecture Notes in Computer Science, Springer.","DOI":"10.1007\/978-3-030-33778-0_7"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"247","DOI":"10.1007\/s10994-006-9449-2","article-title":"An analysis of diversity measures","volume":"65","author":"Tang","year":"2006","journal-title":"Mach. Learn."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Olson, D.L., and Wu, D. (2017). Regression Tree Models. Predictive Data Mining Models, Springer.","DOI":"10.1007\/978-981-10-2543-3"},{"key":"ref_42","unstructured":"Wang, Y., and Witten, I.H. (1997). Inducing Model Trees for Continuous Classes. European Conference on Machine Learning, Springer."},{"key":"ref_43","first-page":"1","article-title":"Comparative Study of M5 Model Tree and Artificial Neural Network in Estimating Reference Evapotranspiration Using MODIS Products","volume":"2014","author":"Alipour","year":"2014","journal-title":"J. Clim."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1016\/j.conbuildmat.2015.06.055","article-title":"Predicting modulus elasticity of recycled aggregate concrete using M5\u2032 model tree algorithm","volume":"94","author":"Behnood","year":"2015","journal-title":"Constr. Build. Mater."},{"key":"ref_45","unstructured":"Denison, D.D., Hansen, M.H., Holmes, C.C., Mallick, B., and Yu, B. The Boosting Approach to Machine Learning: An Overview. Nonlinear Estimation and Classification, Springer. Lecture Notes in Statistics."},{"key":"ref_46","unstructured":"Linchman, M. (2020, October 30). UCI Machine Learning Repository. Available online: http:\/\/archive.ics.uci.edu\/ml\/."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"1774","DOI":"10.1109\/TNNLS.2017.2673241","article-title":"Efficient kNN Classification with Different Numbers of Nearest Neighbors","volume":"29","author":"Zhang","year":"2017","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"261","DOI":"10.1007\/s10462-011-9272-4","article-title":"Decision trees: A recent overview","volume":"39","author":"Kotsiantis","year":"2013","journal-title":"Artif. Intell. Rev."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1007\/s10994-005-0466-3","article-title":"Logistic Model Trees","volume":"59","author":"Landwehr","year":"2005","journal-title":"Mach. Learn."},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"293","DOI":"10.1007\/s10618-009-0131-8","article-title":"FURIA: An algorithm for unordered fuzzy rule induction","volume":"19","year":"2009","journal-title":"Data Min. Knowl. Discov."},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"260","DOI":"10.1016\/j.ins.2019.08.071","article-title":"Class-specific attribute value weighting for Naive Bayes","volume":"508","author":"Zhang","year":"2020","journal-title":"Inf. Sci."},{"key":"ref_52","first-page":"95-1","article-title":"JCLAL: A Java Framework for Active Learning","volume":"17","author":"Reyes","year":"2016","journal-title":"J. Mach. Learn. Res."},{"key":"ref_53","unstructured":"Quinlan, J.R. (1996). Bagging, Boosting, and C4.5, AAAI Press."},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"301","DOI":"10.1007\/s13042-012-0094-8","article-title":"Performance of global\u2013local hybrid ensemble versus boosting and bagging ensembles","volume":"4","author":"Baumgartner","year":"2012","journal-title":"Int. J. Mach. Learn. Cybern."},{"key":"ref_55","doi-asserted-by":"crossref","unstructured":"Eisinga, R., Heskes, T., Pelzer, B., and Grotenhuis, M.T. (2017). Exact p-values for pairwise comparison of Friedman rank sums, with application to comparing classifiers. BMC Bioinform., 18.","DOI":"10.1186\/s12859-017-1486-2"},{"key":"ref_56","unstructured":"Hollander, M., Wolfe, D.A., and Chicken, E. (2013). Nonparametric Statistical Methods. Simulation and the Monte Carlo Method, Wiley. [3rd ed.]."},{"key":"ref_57","first-page":"287","article-title":"Active learning: An empirical study of common baselines","volume":"31","author":"Sharma","year":"2016","journal-title":"Data Min. Knowl. Discov."},{"key":"ref_58","doi-asserted-by":"crossref","first-page":"3535","DOI":"10.1007\/s10489-020-01732-1","article-title":"A boosting Self-Training Framework based on Instance Generation with Natural Neighbors for K Nearest Neighbor","volume":"50","author":"Li","year":"2020","journal-title":"Appl. Intell."},{"key":"ref_59","doi-asserted-by":"crossref","first-page":"26190","DOI":"10.1109\/ACCESS.2017.2766844","article-title":"A LogitBoost-Based Algorithm for Detecting Known and Unknown Web Attacks","volume":"5","author":"Kamarudin","year":"2017","journal-title":"IEEE Access"},{"key":"ref_60","doi-asserted-by":"crossref","first-page":"234","DOI":"10.1016\/j.eswa.2015.09.026","article-title":"Data heterogeneity consideration in semi-supervised learning","volume":"45","author":"Zhao","year":"2016","journal-title":"Expert Syst. Appl."},{"key":"ref_61","unstructured":"Platanios, E.A., Kapoor, A., and Horvitz, E. (2017). Active Learning amidst Logical Knowledge. arXiv."},{"key":"ref_62","doi-asserted-by":"crossref","unstructured":"Chen, T., and Guestrin, C. (2016, January 13\u201317). XGBoost: A Scalable Tree Boosting System. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA.","DOI":"10.1145\/2939672.2939785"},{"key":"ref_63","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1016\/j.neucom.2017.05.105","article-title":"Empirical investigation of active learning strategies","volume":"326\u2013327","author":"Santos","year":"2019","journal-title":"Neurocomputing"},{"key":"ref_64","doi-asserted-by":"crossref","first-page":"356","DOI":"10.1016\/j.ins.2017.06.038","article-title":"On-line active learning: A new paradigm to improve practical useability of data stream modeling methods","volume":"415","author":"Lughofer","year":"2017","journal-title":"Inf. Sci."}],"container-title":["Informatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2227-9709\/7\/4\/50\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T10:28:42Z","timestamp":1760178522000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2227-9709\/7\/4\/50"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,11,3]]},"references-count":64,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2020,12]]}},"alternative-id":["informatics7040050"],"URL":"https:\/\/doi.org\/10.3390\/informatics7040050","relation":{},"ISSN":["2227-9709"],"issn-type":[{"type":"electronic","value":"2227-9709"}],"subject":[],"published":{"date-parts":[[2020,11,3]]}}}