{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,5,14]],"date-time":"2025-05-14T12:01:57Z","timestamp":1747224117606,"version":"3.40.5"},"reference-count":48,"publisher":"IGI Global","issue":"3","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020,7]]},"abstract":"<jats:p>Opinion mining focuses on extracting polarity information from texts. For textual term representation, different feature selection methods, e.g. term frequency (TF) or term frequency\u2013inverse document frequency (TF\u2013IDF), can yield diverse numbers of text features. In text classification, however, a selected training set may contain noisy documents (or outliers), which can degrade the classification performance. To solve this problem, instance selection can be adopted to filter out unrepresentative training documents. Therefore, this article investigates the opinion mining performance associated with feature and instance selection steps simultaneously. Two combination processes based on performing feature selection and instance selection in different orders were compared. Specifically, two feature selection methods, namely TF and TF\u2013IDF, and two instance selection methods, namely DROP3 and IB3, were employed for comparison. 
Experimental results from sentiment classifiers developed on three Twitter datasets showed that TF\u2013IDF followed by DROP3 performs best.<\/jats:p>","DOI":"10.4018\/ijdwm.2020070109","type":"journal-article","created":{"date-parts":[[2020,5,28]],"date-time":"2020-05-28T14:30:35Z","timestamp":1590676235000},"page":"168-182","source":"Crossref","is-referenced-by-count":3,"title":["Integrating Feature and Instance Selection Techniques in Opinion Mining"],"prefix":"10.4018","volume":"16","author":[{"given":"Zi-Hung","family":"You","sequence":"first","affiliation":[{"name":"Department of Nephrology, Chiayi Branch, Taichung Veterans General Hospital, Chiayi, Taiwan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3285-2983","authenticated-orcid":true,"given":"Ya-Han","family":"Hu","sequence":"additional","affiliation":[{"name":"Department of Information Management, National Central University, Taoyuan, Taiwan & Center for Innovative Research on Aging Society (CIRAS), Chiayi, National Chung Cheng University, Taiwan & MOST AI Biomedical Research Center at National Cheng Kung University, Tainan, Taiwan"}]},{"given":"Chih-Fong","family":"Tsai","sequence":"additional","affiliation":[{"name":"Department of Information Management, National Central University, Taiwan"}]},{"given":"Yen-Ming","family":"Kuo","sequence":"additional","affiliation":[{"name":"Department of Information Management, National Chung Cheng University, Chiayi, 
Taiwan"}]}],"member":"2432","reference":[{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-0","DOI":"10.1145\/1361684.1361685"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-1","DOI":"10.1007\/978-3-642-37256-8_2"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-2","DOI":"10.1016\/j.eswa.2008.08.022"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-3","DOI":"10.1007\/BF00153759"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-4","DOI":"10.1016\/j.eswa.2011.09.160"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-5","DOI":"10.1145\/2502069.2502071"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-6","DOI":"10.1109\/MIS.2013.30"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-7","DOI":"10.1016\/j.knosys.2018.02.005"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-8","DOI":"10.1016\/j.aci.2017.03.001"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-9","DOI":"10.1111\/j.1083-6101.2007.00393.x"},{"key":"IJDWM.2020070109-10","first-page":"1289","article-title":"An extensive empirical study of feature selection metrics for text classification.","volume":"3","author":"G.Forman","year":"2003","journal-title":"Journal of Machine Learning Research"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-11","DOI":"10.1109\/TPAMI.2011.142"},{"unstructured":"Go, A., Bhayani, R., and Huang, L. (2009). Twitter sentiment classification using distant supervision. 
CS224N Project Report.","key":"IJDWM.2020070109-12"},{"key":"IJDWM.2020070109-13","first-page":"1157","article-title":"An introduction to variable and feature selection.","volume":"3","author":"I.Guyon","year":"2003","journal-title":"Journal of Machine Learning Research"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-14","DOI":"10.1016\/j.ipm.2016.12.002"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-15","DOI":"10.1108\/EL-02-2018-0040"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-16","DOI":"10.1016\/j.eswa.2016.05.038"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-17","DOI":"10.1016\/j.dss.2013.09.004"},{"unstructured":"Kohavi, R. (1995). A study of cross-validation and bootstrap for accuracy estimation and model selection. In Proceedings of the International Joint Conference on Artificial Intelligence (Vol. 2, pp. 1137-1143). Academic Press.","key":"IJDWM.2020070109-18"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-19","DOI":"10.1016\/j.ipm.2004.08.006"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-20","DOI":"10.1016\/j.tele.2018.01.001"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-21","DOI":"10.1016\/j.dss.2009.09.003"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-22","DOI":"10.1016\/j.ejor.2007.08.008"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-23","DOI":"10.3414\/ME13-01-0027"},{"year":"1999","author":"C. 
D.Manning","journal-title":"Foundations of statistical natural language processing","key":"IJDWM.2020070109-24"},{"year":"1997","author":"T.Mitchell","journal-title":"Machine Learning","key":"IJDWM.2020070109-25"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-26","DOI":"10.1016\/j.ipm.2018.02.001"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-27","DOI":"10.1007\/978-3-642-13059-5_30"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-28","DOI":"10.1561\/1500000011"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-29","DOI":"10.1016\/j.ipm.2016.07.001"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-30","DOI":"10.1002\/9780470182963"},{"year":"2014","author":"J. R.Quinlan","journal-title":"C4.5: programs for machine learning","key":"IJDWM.2020070109-31"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-32","DOI":"10.1016\/j.ipm.2016.12.004"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-33","DOI":"10.1016\/j.ipm.2015.01.005"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-34","DOI":"10.1016\/0306-4573(88)90021-0"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-35","DOI":"10.1145\/505282.505283"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-36","DOI":"10.1145\/1631144.1631148"},{"key":"IJDWM.2020070109-37","first-page":"53","article-title":"Twitter polarity classification with label propagation over lexical links and the follower graph.","author":"M.Speriosu","year":"2011","journal-title":"International Workshop on Unsupervised Learning in 
NLP"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-38","DOI":"10.1002\/asi.21462"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-39","DOI":"10.1016\/j.knosys.2013.10.019"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-40","DOI":"10.1016\/j.is.2013.05.001"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-41","DOI":"10.1016\/j.jss.2013.12.034"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-42","DOI":"10.1007\/978-1-4757-2440-0"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-43","DOI":"10.1016\/j.ipm.2016.08.003"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-44","DOI":"10.1023\/A:1007626913721"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-45","DOI":"10.1016\/j.knosys.2017.11.035"},{"unstructured":"Yang, Y., & Pedersen, J. O. (1997). A comparative study of feature selection in text categorization. In Proceedings of the International Conference on Machine Learning (pp. 412-420). Academic Press.","key":"IJDWM.2020070109-46"},{"doi-asserted-by":"publisher","key":"IJDWM.2020070109-47","DOI":"10.1145\/1007730.1007741"}],"container-title":["International Journal of Data Warehousing and Mining"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.igi-global.com\/viewtitle.aspx?TitleId=256168","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,5,6]],"date-time":"2022-05-06T13:53:20Z","timestamp":1651845200000},"score":1,"resource":{"primary":{"URL":"http:\/\/services.igi-global.com\/resolvedoi\/resolve.aspx?doi=10.4018\/IJDWM.2020070109"}},"subtitle":[""],"short-title":[],"issued":{"date-parts":[[2020,7]]},"references-count":48,"journal-issue":{"issue":"3"},"URL":"https:\/\/doi.org\/10.4018\/ijdwm.2020070109","relation":{},"ISSN":["1548-3924","1548-3932"],"issn-type":[{"type":"print","value":"1548-3924"},{"type":"electronic","value":"1548-3932"}],"subject":[],"published":{"date-parts":[[2020,7]]}}}