{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T06:31:40Z","timestamp":1777703500062,"version":"3.51.4"},"reference-count":26,"publisher":"SAGE Publications","issue":"2","license":[{"start":{"date-parts":[[2016,12,23]],"date-time":"2016-12-23T00:00:00Z","timestamp":1482451200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Journal of Intelligent & Fuzzy Systems"],"published-print":{"date-parts":[[2017,1,30]]},"abstract":"<jats:p>The imbalanced data problem occurs when the number of representative instances for classes of interest is much lower than for other classes. The influence of imbalanced data on classification performance has been discussed in some previous research as a challenge to be studied. In this paper, we propose a method to solve the imbalanced data problem by focusing on preprocessing, including: i) sampling techniques (i.e., under-sampling, over-sampling, and hybrid-sampling) and ii) the instance weighting method to increase the number of features in minority classes and to reduce comprehensive coverage in majority classes. The experimental results show that the noisy data is reduced, making a smaller sized dataset, and training time decreases significantly. Moreover, distinct properties of each class are examined effectively. Refined data is used as input for Naive Bayes and support vector machine classifiers for the targets of the training process. The proposed methods are evaluated based on the number of non-geotagged resources that are labeled correctly with their geo-locations. 
In comparison with previous research, the proposed method achieves accuracy of 84%, whereas previous results were 75%.<\/jats:p>","DOI":"10.3233\/jifs-169140","type":"journal-article","created":{"date-parts":[[2016,12,23]],"date-time":"2016-12-23T17:06:47Z","timestamp":1482512807000},"page":"1437-1448","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":4,"title":["Handling imbalanced classification problem: A case study on social media\u00a0datasets"],"prefix":"10.1177","volume":"32","author":[{"given":"Tuong Tri","family":"Nguyen","sequence":"first","affiliation":[{"name":"Department of Computer Engineering, Yeungnam University, Gyeongsan, South Korea"}]},{"given":"Dosam","family":"Hwang","sequence":"additional","affiliation":[{"name":"Department of Computer Engineering, Yeungnam University, Gyeongsan, South Korea"}]},{"given":"Jason J.","family":"Jung","sequence":"additional","affiliation":[{"name":"Department of Computer Engineering, Chung-Ang University, Seoul, South Korea"}]}],"member":"179","published-online":{"date-parts":[[2016,12,23]]},"reference":[{"key":"e_1_3_3_2_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-21219-2_1"},{"key":"e_1_3_3_3_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2016.02.006"},{"key":"e_1_3_3_4_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2010.12.016"},{"key":"e_1_3_3_5_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11036-014-0557-0"},{"key":"e_1_3_3_6_2","first-page":"2293","article-title":"MSVMpack: A multi-class support vector machine package","volume":"12","author":"Lauer F.","year":"2011","unstructured":"LauerF. 
and GuermeurY., MSVMpack: A multi-class support vector machine package, The Journal of Machine Learning Research12 (2011), 2293\u20132296.","journal-title":"The Journal of Machine Learning Research"},{"key":"e_1_3_3_7_2","article-title":"An empirical study of the Naive Bayes classifier","volume":"3","author":"Rish I.","year":"2001","unstructured":"RishI., An empirical study of the Naive Bayes classifier, IJCAI 2001 Workshop on Empirical Methods in Artificial Intelligence, Vol. 3. No. 22. IBM New York, 2001.","journal-title":"IJCAI 2001 Workshop on Empirical Methods in Artificial Intelligence"},{"key":"e_1_3_3_8_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2014.08.051"},{"key":"e_1_3_3_9_2","doi-asserted-by":"publisher","DOI":"10.1093\/comjnl\/bxr102"},{"key":"e_1_3_3_10_2","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.3634"},{"key":"e_1_3_3_11_2","unstructured":"WestonJ. and WatkinsC. Multi-class support vector machines Technical Report CSD-TR-98-04 Department of Computer Science Royal Holloway University of London 1998."},{"key":"e_1_3_3_12_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.websem.2010.04.004"},{"key":"e_1_3_3_13_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.comnet.2012.07.010"},{"key":"e_1_3_3_14_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2015.11.013"},{"key":"e_1_3_3_15_2","doi-asserted-by":"crossref","first-page":"851","DOI":"10.1145\/1835449.1835648","author":"Clements M.","year":"2010","unstructured":"ClementsM., SerdyukovP., de VriesA.P. and ReindersM.J., Using flickr geotags to predict user travel behaviour, In: Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM, SIGIR \u201910, 2010, pp. 
851\u2013852.","journal-title":"Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM, SIGIR \u201910"},{"issue":"4","key":"e_1_3_3_16_2","doi-asserted-by":"crossref","first-page":"463","DOI":"10.1109\/TSMCC.2011.2161285","article-title":"A review on ensembles for the class imbalance problem: Bagging-, boosting-, and hybrid-based approaches","volume":"42","author":"Galar M.","year":"2012","unstructured":"GalarM., FernandezA., BarrenecheaE., BustinceH. and HerreraF., A review on ensembles for the class imbalance problem: Bagging-, boosting-, and hybrid-based approaches, Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on42(4) (2012), 463\u2013484.","journal-title":"Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on"},{"issue":"2","key":"e_1_3_3_17_2","first-page":"285","article-title":"Processing inconsistency of knowledge on semantic level","volume":"11","author":"Nguyen N.T.","year":"2005","unstructured":"NguyenN.T., Processing inconsistency of knowledge on semantic level, Journal of Universal Computer Science11(2) (2005), 285\u2013302.","journal-title":"Journal of Universal Computer Science"},{"key":"e_1_3_3_18_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.compenvurbsys.2013.11.006"},{"key":"e_1_3_3_19_2","doi-asserted-by":"publisher","DOI":"10.1016\/S0034-4257(97)00083-7"},{"key":"e_1_3_3_20_2","doi-asserted-by":"publisher","DOI":"10.1007\/s40595-013-0004-3"},{"key":"e_1_3_3_21_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.inffus.2015.08.005"},{"key":"e_1_3_3_22_2","first-page":"579","article-title":"Travel route recommendation using geotags in photo sharing sites","author":"Kurashima T.","year":"2010","unstructured":"KurashimaT., IwataT., IrieG. and FujimuraK., Travel route recommendation using geotags in photo sharing sites. 
In: Proceedings of the 19th ACM International Conference on Information and Knowledge Management, ACM, 2010, pp. 579\u2013588.","journal-title":"Proceedings of the 19th ACM International Conference on Information and Knowledge Management, ACM"},{"key":"e_1_3_3_23_2","first-page":"357","volume-title":"Proceedings of the 8th International Symposium on Intelligent Distributed Computing, IDC 2014","author":"Nguyen T.T.","year":"2014","unstructured":"NguyenT.T., HwangD. and JungJ.J., Social tagging analytics for processing unlabeled resources: A case study on non-geotagged photos. In: Proceedings of the 8th International Symposium on Intelligent Distributed Computing, IDC 2014, Madrid, Spain, 2014, pp. 357\u2013367."},{"key":"e_1_3_3_24_2","doi-asserted-by":"publisher","DOI":"10.2298\/CSIS141015091T"},{"key":"e_1_3_3_25_2","doi-asserted-by":"publisher","DOI":"10.1142\/S0218001409007326"},{"key":"e_1_3_3_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSMC.2008.4811259"},{"key":"e_1_3_3_27_2","doi-asserted-by":"publisher","DOI":"10.1198\/016214504000000098"}],"container-title":["Journal of Intelligent & Fuzzy 
Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/JIFS-169140","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.3233\/JIFS-169140","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/JIFS-169140","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T09:38:51Z","timestamp":1777455531000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.3233\/JIFS-169140"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,12,23]]},"references-count":26,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2017,1,30]]}},"alternative-id":["10.3233\/JIFS-169140"],"URL":"https:\/\/doi.org\/10.3233\/jifs-169140","relation":{},"ISSN":["1064-1246","1875-8967"],"issn-type":[{"value":"1064-1246","type":"print"},{"value":"1875-8967","type":"electronic"}],"subject":[],"published":{"date-parts":[[2016,12,23]]}}}