{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2022,3,29]],"date-time":"2022-03-29T18:26:29Z","timestamp":1648578389112},"reference-count":0,"publisher":"IOS Press","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,9,8]]},"abstract":"<jats:p>In this paper, we build a new dataset, UIT-ViON (Vietnamese Online Newspaper), collected from well-known online newspapers in Vietnamese. We collect, process, and create the dataset, then experiment with different machine learning models. In particular, we propose an open-domain, large-scale, and high-quality dataset consisting of 260,000 textual data points annotated with multiple labels for evaluating Vietnamese short text classification. In addition, we present an approach using transformer-based learning (PhoBERT) for Vietnamese short text classification on the dataset, which outperforms traditional machine learning models (Naive Bayes and Logistic Regression) and deep learning models (Text-CNN and LSTM). The proposed approach achieves an F1-score of 80.62%. This is a positive result and a basis for developing an automatic news classification system. The study aims to significantly save time, costs, and human resources and to make it easier for readers to find news on topics that interest them. 
In future, we will propose solutions to improve the quality of the dataset and improve the performance of classification models.<\/jats:p>","DOI":"10.3233\/faia210036","type":"book-chapter","created":{"date-parts":[[2021,9,16]],"date-time":"2021-09-16T09:22:55Z","timestamp":1631784175000},"source":"Crossref","is-referenced-by-count":0,"title":["An Empirical Investigation of Online News Classification on an Open-Domain, Large-Scale and High-Quality Dataset in Vietnamese"],"prefix":"10.3233","author":[{"given":"Khanh Quoc","family":"Tran","sequence":"first","affiliation":[{"name":"University of Information Technology, Ho Chi Minh city, Vietnam"},{"name":"Vietnam National University, Ho Chi Minh City, Vietnam Email: 18520908@gm.uit.edu.vn, 18521227@gm.uit.edu.vn, 8520938@gm.uit.edu.vn, 18520426@gm.uit.edu.vn, 8521062@gm.uit.edu.vn, kietnv@uit.edu.vn"}]},{"given":"Phap Ngoc","family":"Trinh","sequence":"additional","affiliation":[{"name":"University of Information Technology, Ho Chi Minh city, Vietnam"},{"name":"Vietnam National University, Ho Chi Minh City, Vietnam Email: 18520908@gm.uit.edu.vn, 18521227@gm.uit.edu.vn, 8520938@gm.uit.edu.vn, 18520426@gm.uit.edu.vn, 8521062@gm.uit.edu.vn, kietnv@uit.edu.vn"}]},{"given":"Khoa Nguyen-Anh","family":"Tran","sequence":"additional","affiliation":[{"name":"University of Information Technology, Ho Chi Minh city, Vietnam"},{"name":"Vietnam National University, Ho Chi Minh City, Vietnam Email: 18520908@gm.uit.edu.vn, 18521227@gm.uit.edu.vn, 8520938@gm.uit.edu.vn, 18520426@gm.uit.edu.vn, 8521062@gm.uit.edu.vn, kietnv@uit.edu.vn"}]},{"given":"An Tran-Hoai","family":"Le","sequence":"additional","affiliation":[{"name":"University of Information Technology, Ho Chi Minh city, Vietnam"},{"name":"Vietnam National University, Ho Chi Minh City, Vietnam Email: 18520908@gm.uit.edu.vn, 18521227@gm.uit.edu.vn, 8520938@gm.uit.edu.vn, 18520426@gm.uit.edu.vn, 8521062@gm.uit.edu.vn, kietnv@uit.edu.vn"}]},{"given":"Luan 
Van","family":"Ha","sequence":"additional","affiliation":[{"name":"University of Information Technology, Ho Chi Minh city, Vietnam"},{"name":"Vietnam National University, Ho Chi Minh City, Vietnam Email: 18520908@gm.uit.edu.vn, 18521227@gm.uit.edu.vn, 8520938@gm.uit.edu.vn, 18520426@gm.uit.edu.vn, 8521062@gm.uit.edu.vn, kietnv@uit.edu.vn"}]},{"given":"Kiet Van","family":"Nguyen","sequence":"additional","affiliation":[{"name":"University of Information Technology, Ho Chi Minh city, Vietnam"},{"name":"Vietnam National University, Ho Chi Minh City, Vietnam Email: 18520908@gm.uit.edu.vn, 18521227@gm.uit.edu.vn, 8520938@gm.uit.edu.vn, 18520426@gm.uit.edu.vn, 8521062@gm.uit.edu.vn, kietnv@uit.edu.vn"}]}],"member":"7437","container-title":["Frontiers in Artificial Intelligence and Applications","New Trends in Intelligent Software Methodologies, Tools and Techniques"],"original-title":[],"link":[{"URL":"https:\/\/ebooks.iospress.nl\/pdf\/doi\/10.3233\/FAIA210036","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,10,25]],"date-time":"2021-10-25T13:29:07Z","timestamp":1635168547000},"score":1,"resource":{"primary":{"URL":"https:\/\/ebooks.iospress.nl\/doi\/10.3233\/FAIA210036"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,9,8]]},"references-count":0,"URL":"https:\/\/doi.org\/10.3233\/faia210036","relation":{},"ISSN":["0922-6389","1879-8314"],"issn-type":[{"value":"0922-6389","type":"print"},{"value":"1879-8314","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,9,8]]}}}