{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,14]],"date-time":"2026-07-14T17:50:40Z","timestamp":1784051440975,"version":"3.55.0"},"reference-count":67,"publisher":"Cambridge University Press (CUP)","issue":"4","license":[{"start":{"date-parts":[[2022,10,26]],"date-time":"2022-10-26T00:00:00Z","timestamp":1666742400000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/www.cambridge.org\/core\/terms"}],"content-domain":{"domain":["cambridge.org"],"crossmark-restriction":true},"short-container-title":["Nat. Lang. Eng."],"published-print":{"date-parts":[[2023,7]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Fake news detection is an emerging topic that has attracted a lot of attention among researchers and in the industry. This paper focuses on fake news detection as a text classification problem: on the basis of five publicly available corpora with documents labeled as true or fake, the task was to automatically distinguish both classes without relying on fact-checking. The aim of our research was to test the feasibility of a universal model: one that produces satisfactory results on all data sets tested in our article. We attempted to do so by training a set of classification models on one collection and testing them on another. As it turned out, this resulted in a sharp performance degradation. Therefore, this paper focuses on finding the most effective approach to utilizing information in a transferable manner. We examined a variety of methods: feature selection, machine learning approaches to data set shift (instance re-weighting and projection-based), and deep learning approaches based on domain transfer. These methods were applied to various feature spaces: linguistic and psycholinguistic, embeddings obtained from the Universal Sentence Encoder, and GloVe embeddings. 
A detailed analysis showed that some combinations of these methods and selected feature spaces bring significant improvements. When using linguistic data, feature selection yielded the best overall mean improvement (across all train-test pairs) of 4%. Among the domain adaptation methods, the greatest improvement of 3% was achieved by subspace alignment.<\/jats:p>","DOI":"10.1017\/s1351324922000456","type":"journal-article","created":{"date-parts":[[2022,10,26]],"date-time":"2022-10-26T07:58:04Z","timestamp":1666771084000},"page":"1004-1042","update-policy":"https:\/\/doi.org\/10.1017\/policypage","source":"Crossref","is-referenced-by-count":3,"title":["Towards universal methods for fake news detection"],"prefix":"10.1017","volume":"29","author":[{"given":"Maria","family":"Pszona","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Maria","family":"Janicka","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Grzegorz","family":"Wojdyga","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7081-9797","authenticated-orcid":false,"given":"Aleksander","family":"Wawer","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"56","published-online":{"date-parts":[[2022,10,26]]},"reference":[{"key":"S1351324922000456_ref53","first-page":"443","author":"Sun","year":"2016"},{"key":"S1351324922000456_ref61","doi-asserted-by":"crossref","unstructured":"Wawer, A. , Wojdyga, G. and Sarzy\u0144ska-Wawer, J. (2019). Fact checking or psycholinguistics: How to distinguish fake and true claims? In Proceedings of the Second Workshop on Fact Extraction and VERification (FEVER), Hong Kong, China. Association for Computational Linguistics, pp. 
7\u201312.","DOI":"10.18653\/v1\/D19-6602"},{"key":"S1351324922000456_ref10","first-page":"1","article-title":"Statistical comparisons of classifiers over multiple data sets","volume":"7","author":"Dem\u0161ar","year":"2006","journal-title":"Journal of Machine Learning Research"},{"key":"S1351324922000456_ref19","doi-asserted-by":"crossref","unstructured":"Horne, B. and Adali, S. (2017). This just in: Fake news packs a lot in title, uses simpler, repetitive content in text body, more similar to satire than real news. In Proceedings of the International AAAI Conference on Web and Social Media, pp. 759\u2013766.","DOI":"10.1609\/icwsm.v11i1.14976"},{"key":"S1351324922000456_ref47","doi-asserted-by":"publisher","DOI":"10.1089\/big.2020.0062"},{"key":"S1351324922000456_ref15","unstructured":"Fu, L. , Nguyen, T.H. , Min, B. and Grishman, R. (2017). Domain adaptation for relation extraction with domain adversarial neural network. In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers), Taipei, Taiwan. Asian Federation of Natural Language Processing, pp. 425\u2013429."},{"key":"S1351324922000456_ref12","first-page":"301","author":"Elsayed","year":"2019"},{"key":"S1351324922000456_ref56","doi-asserted-by":"crossref","unstructured":"Thorne, J. , Vlachos, A. , Cocarascu, O. , Christodoulopoulos, C. and Mittal, A. (eds) (2019). Proceedings of the Second Workshop on Fact Extraction and VERification (FEVER), Hong Kong, China. Association for Computational Linguistics.","DOI":"10.18653\/v1\/W18-5501"},{"key":"S1351324922000456_ref7","doi-asserted-by":"publisher","DOI":"10.1093\/biomet\/37.3-4.256"},{"key":"S1351324922000456_ref18","doi-asserted-by":"publisher","DOI":"10.1177\/002194366900600202"},{"key":"S1351324922000456_ref25","unstructured":"Leippold, M. and Diggelmann, T. (2020). Climate-fever: A dataset for verification of real-world climate claims. 
In NeurIPS 2020 Workshop on Tackling Climate Change with Machine Learning."},{"key":"S1351324922000456_ref2","doi-asserted-by":"publisher","DOI":"10.1257\/jep.31.2.211"},{"key":"S1351324922000456_ref21","unstructured":"Jiang, J. and Zhai, C. (2007). Instance weighting for domain adaptation in NLP. In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics, Prague, Czech Republic. Association for Computational Linguistics, pp. 264\u2013271."},{"key":"S1351324922000456_ref57","unstructured":"Tzeng, E. , Hoffman, J. , Zhang, N. , Saenko, K. and Darrell, T. (2014). Deep domain confusion: Maximizing for domain invariance. CoRR, abs\/1412.3474."},{"key":"S1351324922000456_ref41","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i01.5386"},{"key":"S1351324922000456_ref28","first-page":"372","author":"Nakov","year":"2018"},{"key":"S1351324922000456_ref24","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevE.69.066138"},{"key":"S1351324922000456_ref32","doi-asserted-by":"publisher","DOI":"10.3115\/1218955.1218990"},{"key":"S1351324922000456_ref16","unstructured":"Ganin, Y. and Lempitsky, V. (2015). Unsupervised domain adaptation by backpropagation. In Proceedings of the 32nd International Conference on Machine Learning - Volume 37, ICML\u201915, pp. 1180\u20131189. JMLR.org."},{"key":"S1351324922000456_ref59","doi-asserted-by":"crossref","unstructured":"Wadden, D. , Lin, S. , Lo, K. , Wang, L.L. , van Zuylen, M. , Cohan, A. and Hajishirzi, H. (2020). Fact or fiction: Verifying scientific claims. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online. Association for Computational Linguistics, pp. 
7534\u20137550.","DOI":"10.18653\/v1\/2020.emnlp-main.609"},{"key":"S1351324922000456_ref13","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2013.368"},{"key":"S1351324922000456_ref3","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1475"},{"key":"S1351324922000456_ref43","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D17-1317"},{"key":"S1351324922000456_ref37","unstructured":"P\u00e9rez-Rosas, V. , Kleinberg, B. , Lefevre, A. and Mihalcea, R. (2018). Automatic detection of fake news. In Proceedings of the 27th International Conference on Computational Linguistics, Santa Fe, New Mexico, USA. Association for Computational Linguistics, pp. 3391\u20133401."},{"key":"S1351324922000456_ref55","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-1074"},{"key":"S1351324922000456_ref40","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-1022"},{"key":"S1351324922000456_ref44","unstructured":"Saito, K. , Ushiku, Y. , Harada, T. and Saenko, K. (2018). Adversarial dropout regularization. In International Conference on Learning Representations."},{"key":"S1351324922000456_ref60","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P17-2067"},{"key":"S1351324922000456_ref4","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58219-7_17"},{"key":"S1351324922000456_ref27","doi-asserted-by":"crossref","unstructured":"Liu, P. , Qiu, X. and Huang, X. (2017). Adversarial multi-task learning for text classification. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Vancouver, Canada. Association for Computational Linguistics, pp. 
1\u201310.","DOI":"10.18653\/v1\/P17-1001"},{"key":"S1351324922000456_ref5","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-2029"},{"key":"S1351324922000456_ref9","doi-asserted-by":"publisher","DOI":"10.1002\/pra2.2015.145052010082"},{"key":"S1351324922000456_ref1","first-page":"127","author":"Ahmed","year":"2017"},{"key":"S1351324922000456_ref23","unstructured":"Kincaid, J. , Fishburne, R. , Rogers, R. and Chissom, B. (1975). Research branch report 8\u201375. Memphis: Naval Air Station."},{"key":"S1351324922000456_ref64","first-page":"1485","article-title":"Robustness and regularization of support vector machines","volume":"10","author":"Xu","year":"2009","journal-title":"Journal of Machine Learning Research"},{"key":"S1351324922000456_ref38","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-1202"},{"key":"S1351324922000456_ref36","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/D14-1162"},{"key":"S1351324922000456_ref22","doi-asserted-by":"publisher","DOI":"10.1016\/j.mlwa.2021.100032"},{"key":"S1351324922000456_ref17","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2012.6247911"},{"key":"S1351324922000456_ref45","unstructured":"Santos, R. , Pedro, G. , Leal, S. , Vale, O. , Pardo, T. , Bontcheva, K. and Scarton, C. (2020). Measuring the impact of readability features in fake news detection. In Proceedings of the 12th Language Resources and Evaluation Conference, Marseille, France. European Language Resources Association, pp. 
1404\u20131413."},{"key":"S1351324922000456_ref51","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-6616"},{"key":"S1351324922000456_ref52","volume-title":"The General Inquirer: A Computer Approach to Content Analysis","author":"Stone","year":"1966"},{"key":"S1351324922000456_ref54","first-page":"309","author":"Tchechmedjiev","year":"2019"},{"key":"S1351324922000456_ref49","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v35i1.16134"},{"key":"S1351324922000456_ref42","volume-title":"Dataset Shift in Machine Learning","author":"Qui\u00f1onero-Candela","year":"2009"},{"key":"S1351324922000456_ref20","doi-asserted-by":"publisher","DOI":"10.13053\/cys-23-3-3281"},{"key":"S1351324922000456_ref34","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2005.159"},{"key":"S1351324922000456_ref67","doi-asserted-by":"publisher","DOI":"10.1145\/3377478"},{"key":"S1351324922000456_ref14","doi-asserted-by":"publisher","DOI":"10.1037\/h0057532"},{"key":"S1351324922000456_ref35","unstructured":"Pennebaker, J. , Boyd, R. , Jordan, K. and Blackburn, K. (2015). The development and psychometric properties of LIWC2015. Technical report, Austin, TX: University of Texas at Austin."},{"key":"S1351324922000456_ref31","doi-asserted-by":"publisher","DOI":"10.1177\/0146167203029005010"},{"key":"S1351324922000456_ref48","unstructured":"Shu, R. , Bui, H. , Narui, H. and Ermon, S. (2018). A DIRT-t approach to unsupervised domain adaptation. In International Conference on Learning Representations."},{"key":"S1351324922000456_ref50","first-page":"1","article-title":"Automated readability index","author":"Smith","year":"1967","journal-title":"AMRL-TR. Aerospace Medical Research Laboratories"},{"key":"S1351324922000456_ref11","unstructured":"Devlin, J. , Chang, M.-W. , Lee, K. and Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. 
In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, Minnesota. Association for Computational Linguistics, pp. 4171\u20134186."},{"key":"S1351324922000456_ref58","doi-asserted-by":"publisher","DOI":"10.1126\/science.aap9559"},{"key":"S1351324922000456_ref62","doi-asserted-by":"publisher","DOI":"10.2307\/3001968"},{"key":"S1351324922000456_ref65","unstructured":"Yang, Y. , Zheng, L. , Zhang, J. , Cui, Q. , Li, Z. and Yu, P.S. (2018). Ti-cnn: Convolutional neural networks for fake news detection. arXiv preprint arXiv:1806.00749."},{"key":"S1351324922000456_ref6","unstructured":"Chu, C. and Wang, R. (2018). A survey of domain adaptation for neural machine translation. In Proceedings of the 27th International Conference on Computational Linguistics, Santa Fe, New Mexico, USA. Association for Computational Linguistics, pp. 1304\u20131319."},{"key":"S1351324922000456_ref29","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2021\/619"},{"key":"S1351324922000456_ref66","unstructured":"Zellers, R. , Holtzman, A. , Rashkin, H. , Bisk, Y. , Farhadi, A. , Roesner, F. and Choi, Y. (2019). Defending against neural fake news. In Wallach H., Larochelle H., Beygelzimer A., d\u2019Alch\u00e9-Buc F., Fox E. and Garnett, R. (eds), Advances in Neural Information Processing Systems, vol. 32. Curran Associates, Inc."},{"key":"S1351324922000456_ref8","doi-asserted-by":"publisher","DOI":"10.1037\/h0076540"},{"key":"S1351324922000456_ref63","unstructured":"Wilson, G. and Cook, D.J. (2018). A survey of unsupervised deep domain adaptation. 
CoRR, abs\/1812.02849."},{"key":"S1351324922000456_ref26","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-5004"},{"key":"S1351324922000456_ref33","first-page":"2825","article-title":"Scikit-learn: Machine learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"Journal of Machine Learning Research"},{"key":"S1351324922000456_ref30","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-72240-1_75"},{"key":"S1351324922000456_ref46","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1341"},{"key":"S1351324922000456_ref39","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1003"}],"container-title":["Natural Language Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.cambridge.org\/core\/services\/aop-cambridge-core\/content\/view\/S1351324922000456","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,7,19]],"date-time":"2023-07-19T08:59:41Z","timestamp":1689757181000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.cambridge.org\/core\/product\/identifier\/S1351324922000456\/type\/journal_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,10,26]]},"references-count":67,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2023,7]]}},"alternative-id":["S1351324922000456"],"URL":"https:\/\/doi.org\/10.1017\/s1351324922000456","relation":{},"ISSN":["1351-3249","1469-8110"],"issn-type":[{"value":"1351-3249","type":"print"},{"value":"1469-8110","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,10,26]]},"assertion":[{"value":"\u00a9 The Author(s), 2022. Published by Cambridge University Press","name":"copyright","label":"Copyright","group":{"name":"copyright_and_licensing","label":"Copyright and Licensing"}}]}}