{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,2]],"date-time":"2026-02-02T14:47:33Z","timestamp":1770043653170,"version":"3.49.0"},"reference-count":17,"publisher":"SAGE Publications","issue":"1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IFS"],"published-print":{"date-parts":[[2021,8,11]]},"abstract":"<jats:p>Machine Reading Comprehension has attracted significant interest in research on natural language understanding, and large-scale datasets and neural network-based methods have been developed for this task. However, most developments of resources and methods in machine reading comprehension have been investigated using two resource-rich languages, English and Chinese. This article proposes a system called ViReader for open-domain machine reading comprehension in Vietnamese by using Wikipedia as the textual knowledge source, where the answer to any particular question is a textual span derived directly from texts on Vietnamese Wikipedia. Our system combines a sentence retriever component, based on techniques of information retrieval to extract the relevant sentences, with a transfer learning-based answer extractor trained to predict answers based on Wikipedia texts. Experiments on multiple datasets for machine reading comprehension in Vietnamese and other languages demonstrate that (1) our ViReader system is highly competitive with prevalent machine learning-based systems, and (2) multi-task learning by using a combination consisting of the sentence retriever and answer extractor is an end-to-end reading comprehension system. The sentence retriever component of our proposed system retrieves the sentences that are most likely to provide the answer response to the given question. The transfer learning-based answer extractor then reads the document from which the sentences have been retrieved, predicts the answer, and returns it to the user. 
The ViReader system achieves new state-of-the-art performances, with values of 70.83% EM (exact match) and 89.54% F1, outperforming the BERT-based system by 11.55% and 9.54%, respectively. It also obtains state-of-the-art performance on UIT-ViNewsQA (another Vietnamese dataset consisting of online health-domain news) and BiPaR (a bilingual dataset on English and Chinese novel texts). Compared with the BERT-based system, our system achieves significant improvements (in terms of F1) with 7.65% for English and 6.13% for Chinese on the BiPaR dataset. Furthermore, we build a ViReader application programming interface that programmers can employ in Artificial Intelligence applications.<\/jats:p>","DOI":"10.3233\/jifs-210683","type":"journal-article","created":{"date-parts":[[2021,7,7]],"date-time":"2021-07-07T05:25:35Z","timestamp":1625635535000},"page":"1993-2011","source":"Crossref","is-referenced-by-count":4,"title":["ViReader: A Wikipedia-based Vietnamese reading comprehension system using transfer learning"],"prefix":"10.1177","volume":"41","author":[{"given":"Kiet","family":"Van Nguyen","sequence":"first","affiliation":[{"name":"University of Information Technology, Ho Chi Minh City, Vietnam"},{"name":"Vietnam National University, Ho Chi Minh City, Vietnam"}]},{"given":"Nhat","family":"Duy Nguyen","sequence":"additional","affiliation":[{"name":"University of Information Technology, Ho Chi Minh City, Vietnam"},{"name":"Vietnam National University, Ho Chi Minh City, Vietnam"}]},{"given":"Phong Nguyen-Thuan","family":"Do","sequence":"additional","affiliation":[{"name":"University of Information Technology, Ho Chi Minh City, Vietnam"},{"name":"Vietnam National University, Ho Chi Minh City, Vietnam"}]},{"given":"Anh","family":"Gia-Tuan Nguyen","sequence":"additional","affiliation":[{"name":"University of Information Technology, Ho Chi Minh City, Vietnam"},{"name":"Vietnam National University, Ho Chi Minh City, Vietnam"}]},{"given":"Ngan 
Luu-Thuy","family":"Nguyen","sequence":"additional","affiliation":[{"name":"University of Information Technology, Ho Chi Minh City, Vietnam"},{"name":"Vietnam National University, Ho Chi Minh City, Vietnam"}]}],"member":"179","reference":[{"issue":"5","key":"10.3233\/JIFS-210683_ref2","doi-asserted-by":"crossref","first-page":"683","DOI":"10.1016\/j.ipm.2014.04.007","article-title":"Open domain question answering using Wikipedia-based knowledge model","volume":"50","author":"Ryu","year":"2014","journal-title":"Information Processing & Management"},{"issue":"1","key":"10.3233\/JIFS-210683_ref4","doi-asserted-by":"crossref","first-page":"102431","DOI":"10.1016\/j.ipm.2020.102431","article-title":"WabiQA: A Wikipedia-Based Thai Question-Answering System","volume":"58","author":"Noraset","year":"2021","journal-title":"Information Processing & Management"},{"issue":"1","key":"10.3233\/JIFS-210683_ref7","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1609\/aimag.v37i1.2636","article-title":"My Computer Is An Honor Student\u2014But How Intelligent Is It? 
Standardized tests as a measure of AI","volume":"37","author":"Clark","year":"2016","journal-title":"AI Magazine"},{"issue":"1-2","key":"10.3233\/JIFS-210683_ref22","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1177\/0022057409189001-208","article-title":"Effective Practices for Developing Reading Comprehension","volume":"189","author":"Duke","year":"2009","journal-title":"Journal of Education"},{"issue":"3","key":"10.3233\/JIFS-210683_ref23","doi-asserted-by":"crossref","first-page":"134","DOI":"10.11648\/j.ijll.20140203.11","article-title":"The Effect of Summarizing Strategy on Reading Comprehension of Iranian Intermediate EFL Learners","volume":"2","author":"Khoshsima","year":"2014","journal-title":"International Journal of Language and Linguistics"},{"issue":"2020","key":"10.3233\/JIFS-210683_ref24","first-page":"201404","article-title":"Enhancing Lexical-Based Approach with External Knowledge for Vietnamese Multiple-Choice Machine Reading Comprehension","volume":"8","author":"Van Nguyen","journal-title":"IEEE Access"},{"issue":"1","key":"10.3233\/JIFS-210683_ref26","first-page":"205","article-title":"A survey automatic text summarization","volume":"5","author":"Tas","year":"2007","journal-title":"Press Academia Procedia"},{"key":"10.3233\/JIFS-210683_ref29","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1162\/tacl_a_00087","article-title":"A Joint Model for Answer Sentence Ranking and Answer Extraction","volume":"4","author":"Sultan","year":"2016","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"10.3233\/JIFS-210683_ref35","doi-asserted-by":"crossref","unstructured":"Rajaraman A. and Ullman J.D. 
, Mining of Massive Datasets, Cambridge University Press (2011).","DOI":"10.1017\/CBO9781139058452"},{"issue":"2","key":"10.3233\/JIFS-210683_ref36","doi-asserted-by":"crossref","first-page":"131","DOI":"10.1007\/s007999900025","article-title":"A Probabilistic Justification for Using TFxIDF Term Weighting in Information Retrieval","volume":"3","author":"Hiemstra","year":"2000","journal-title":"International Journal on Digital Libraries"},{"issue":"4","key":"10.3233\/JIFS-210683_ref37","doi-asserted-by":"crossref","first-page":"285","DOI":"10.21512\/comtech.v7i4.3746","article-title":"Single Document Automatic Text Summarization Using Term Frequency-Inverse Document Frequency (TF-IDF)","volume":"7","author":"Christian","year":"2016","journal-title":"ComTech: Computer, Mathematics and Engineering Applications"},{"key":"10.3233\/JIFS-210683_ref39","first-page":"109","article-title":"Okapi at TREC-3","volume":"109","author":"Robertson","year":"1995","journal-title":"NIST Special Publication SP"},{"key":"10.3233\/JIFS-210683_ref40","first-page":"253","article-title":"Okapi at TREC-7: Automatic Ad Hoc, Filtering, VLC and Interactive Track","volume":"500","author":"Robertson","year":"1999","journal-title":"NIST Special Publication SP"},{"issue":"4","key":"10.3233\/JIFS-210683_ref43","doi-asserted-by":"crossref","first-page":"305","DOI":"10.1007\/s12652-012-0143-x","article-title":"TSGVi: A Graph-Based Summarization System for Vietnamese Documents","volume":"3","author":"Nguyen-Hoang","year":"2012","journal-title":"Journal of Ambient Intelligence and Humanized Computing"},{"key":"10.3233\/JIFS-210683_ref50","doi-asserted-by":"crossref","first-page":"60","DOI":"10.1016\/j.knosys.2016.07.013","article-title":"News Reader: Using Knowledge Resources in a Cross-Lingual Reading Machine to Generate More knowledge from Massive Streams of News","volume":"110","author":"Vossen","year":"2016","journal-title":"Knowledge-Based 
Systems"},{"issue":"3","key":"10.3233\/JIFS-210683_ref51","doi-asserted-by":"crossref","first-page":"445","DOI":"10.1016\/j.ipm.2018.12.003","article-title":"Predicate Constraints Based Question Answering Over Knowledge Graph","volume":"56","author":"Shin","year":"2019","journal-title":"Information Processing & Management"},{"key":"10.3233\/JIFS-210683_ref60","doi-asserted-by":"crossref","first-page":"104842","DOI":"10.1016\/j.knosys.2019.07.013","article-title":"Learning Short-Text Semantic Similarity with Word Embeddings and External Knowledge Sources","volume":"182","author":"Nguyen","year":"2019","journal-title":"Knowledge-Based Systems"}],"container-title":["Journal of Intelligent & Fuzzy Systems"],"original-title":[],"link":[{"URL":"https:\/\/content.iospress.com\/download?id=10.3233\/JIFS-210683","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,2]],"date-time":"2026-02-02T03:16:27Z","timestamp":1770002187000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/full\/10.3233\/JIFS-210683"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,8,11]]},"references-count":17,"journal-issue":{"issue":"1"},"URL":"https:\/\/doi.org\/10.3233\/jifs-210683","relation":{},"ISSN":["1064-1246","1875-8967"],"issn-type":[{"value":"1064-1246","type":"print"},{"value":"1875-8967","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,8,11]]}}}