{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,5,14]],"date-time":"2025-05-14T02:29:59Z","timestamp":1747189799696,"version":"3.40.5"},"reference-count":15,"publisher":"World Scientific Pub Co Pte Ltd","issue":"01","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Int. J. Semantic Computing"],"published-print":{"date-parts":[[2024,3]]},"abstract":"<jats:p> Recent works with the BERT-based models demonstrate their generalization ability and high performance on the new domain tasks. However, this kind of model requires a large amount of data. Collecting this data can be error-prone, and it is important to know: how the errors in data affect the quality of the model. In this work, we study the impact of data with different errors \u2013 noisy data on the training of the question answering-over-text BERT-model. We use the concept of random, structural and irrelevant question noises. We study the robustness of QAT models during the training process with different settings, datasets and noise types and discuss possible reasons. We also propose a real-world domain dataset to probe our findings in a real-world scenario. The results of an experimental study showed that following developed recommendations allowed performance improvement up to 3.6% in a real-world setting. <\/jats:p>","DOI":"10.1142\/s1793351x24410046","type":"journal-article","created":{"date-parts":[[2023,12,17]],"date-time":"2023-12-17T13:31:52Z","timestamp":1702819912000},"page":"77-96","source":"Crossref","is-referenced-by-count":1,"title":["Does Noise Really Matter? Investigation into the Influence of Noisy Labels on BERT-Based Question Answering System"],"prefix":"10.1142","volume":"18","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3494-5315","authenticated-orcid":false,"given":"Dmitriy","family":"Alexandrov","sequence":"first","affiliation":[{"name":"ITMO University, Saint Petersburg, Russia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7624-6790","authenticated-orcid":false,"given":"Anastasiia","family":"Zakharova","sequence":"additional","affiliation":[{"name":"ITMO University, Saint Petersburg, Russia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2705-1313","authenticated-orcid":false,"given":"Nikolay","family":"Butakov","sequence":"additional","affiliation":[{"name":"ITMO University, Saint Petersburg, Russia"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"219","published-online":{"date-parts":[[2024,1,30]]},"reference":[{"key":"S1793351X24410046BIB002","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1612"},{"key":"S1793351X24410046BIB003","doi-asserted-by":"publisher","DOI":"10.1145\/3366423.3380060"},{"key":"S1793351X24410046BIB004","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D16-1264"},{"key":"S1793351X24410046BIB005","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00276"},{"key":"S1793351X24410046BIB006","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W17-2623"},{"key":"S1793351X24410046BIB007","doi-asserted-by":"publisher","DOI":"10.1109\/TR.2021.3070863"},{"key":"S1793351X24410046BIB008","doi-asserted-by":"publisher","DOI":"10.1109\/TR.2022.3156126"},{"key":"S1793351X24410046BIB012","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.emnlp-main.190"},{"key":"S1793351X24410046BIB013","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.emnlp-main.84"},{"key":"S1793351X24410046BIB014","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1606"},{"key":"S1793351X24410046BIB018","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D15-1237"},{"key":"S1793351X24410046BIB019","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.findings-emnlp.342"},{"key":"S1793351X24410046BIB021","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2019\/694"},{"volume-title":"Proc. 2019 Conf. Empirical Methods in Natural Language Processing","author":"Reimers N.","key":"S1793351X24410046BIB022"},{"volume-title":"Int. Conf. Learning Representations","year":"2020","author":"Zhang T.","key":"S1793351X24410046BIB023"}],"container-title":["International Journal of Semantic Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.worldscientific.com\/doi\/pdf\/10.1142\/S1793351X24410046","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,3,26]],"date-time":"2024-03-26T01:25:47Z","timestamp":1711416347000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.worldscientific.com\/doi\/10.1142\/S1793351X24410046"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,1,30]]},"references-count":15,"journal-issue":{"issue":"01","published-print":{"date-parts":[[2024,3]]}},"alternative-id":["10.1142\/S1793351X24410046"],"URL":"https:\/\/doi.org\/10.1142\/s1793351x24410046","relation":{},"ISSN":["1793-351X","1793-7108"],"issn-type":[{"type":"print","value":"1793-351X"},{"type":"electronic","value":"1793-7108"}],"subject":[],"published":{"date-parts":[[2024,1,30]]}}}