{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,2]],"date-time":"2026-06-02T09:23:36Z","timestamp":1780392216981,"version":"3.54.1"},"reference-count":70,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T00:00:00Z","timestamp":1740096000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T00:00:00Z","timestamp":1740096000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100006041","name":"Innovate UK","doi-asserted-by":"publisher","award":["10039055"],"award-info":[{"award-number":["10039055"]}],"id":[{"id":"10.13039\/501100006041","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100006041","name":"Innovate UK","doi-asserted-by":"publisher","award":["10039055"],"award-info":[{"award-number":["10039055"]}],"id":[{"id":"10.13039\/501100006041","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100006041","name":"Innovate UK","doi-asserted-by":"publisher","award":["10039055"],"award-info":[{"award-number":["10039055"]}],"id":[{"id":"10.13039\/501100006041","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100006041","name":"Innovate UK","doi-asserted-by":"publisher","award":["10039055"],"award-info":[{"award-number":["10039055"]}],"id":[{"id":"10.13039\/501100006041","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100018693","name":"HORIZON EUROPE Framework Programme","doi-asserted-by":"publisher","award":["10107009"],"award-info":[{"award-number":["10107009"]}],"id":[{"id":"10.13039\/100018693","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100018693","name":"HORIZON EUROPE Framework 
Programme","doi-asserted-by":"publisher","award":["10107009"],"award-info":[{"award-number":["10107009"]}],"id":[{"id":"10.13039\/100018693","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100018693","name":"HORIZON EUROPE Framework Programme","doi-asserted-by":"publisher","award":["10107009"],"award-info":[{"award-number":["10107009"]}],"id":[{"id":"10.13039\/100018693","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100018693","name":"HORIZON EUROPE Framework Programme","doi-asserted-by":"publisher","award":["10107009"],"award-info":[{"award-number":["10107009"]}],"id":[{"id":"10.13039\/100018693","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["EPJ Data Sci."],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>\n                    Credibility signals represent a wide range of heuristics typically used by journalists and fact-checkers to assess the veracity of online content. Automating the extraction of credibility signals presents significant challenges due to the necessity of training high-accuracy, signal-specific extractors, coupled with the lack of sufficiently large annotated datasets. 
This paper introduces <jats:sc>Pastel<\/jats:sc> (<jats:bold>P<\/jats:bold>rompted we<jats:bold>A<\/jats:bold>k <jats:bold>S<\/jats:bold>upervision wi<jats:bold>T<\/jats:bold>h cr<jats:bold>E<\/jats:bold>dibility signa<jats:bold>L<\/jats:bold>s), a weakly supervised approach that leverages large language models (LLMs) to extract credibility signals from web content, and subsequently combines them to predict the veracity of content without relying on human supervision. We validate our approach using four article-level misinformation detection datasets, demonstrating that <jats:sc>Pastel<\/jats:sc> outperforms zero-shot veracity detection by 38.3% and achieves 86.7% of the performance of the state-of-the-art system trained with human supervision. Moreover, in cross-domain settings where training and testing datasets originate from different domains, <jats:sc>Pastel<\/jats:sc> significantly outperforms the state-of-the-art supervised model by 63%. We further study the association between credibility signals and veracity, and perform an ablation study showing the impact of each signal on model performance. 
Our findings reveal that 12 out of the 19 proposed signals exhibit strong associations with veracity across all datasets, while some signals show domain-specific strengths.\n                  <\/jats:p>","DOI":"10.1140\/epjds\/s13688-025-00534-0","type":"journal-article","created":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T10:55:00Z","timestamp":1740135300000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":11,"title":["Weakly supervised veracity classification with LLM-predicted credibility signals"],"prefix":"10.1140","volume":"14","author":[{"given":"Jo\u00e3o A.","family":"Leite","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Olesya","family":"Razuvayevskaya","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Kalina","family":"Bontcheva","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Carolina","family":"Scarton","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2025,2,21]]},"reference":[{"key":"534_CR1","doi-asserted-by":"publisher","unstructured":"Zhou X, Zafarani R (2020) A survey of fake news: fundamental theories, detection methods, and opportunities. ACM Comput Surv 53(5). https:\/\/doi.org\/10.1145\/3395046","DOI":"10.1145\/3395046"},{"issue":"1","key":"534_CR2","doi-asserted-by":"publisher","DOI":"10.3390\/app13010592","volume":"13","author":"C Fu","year":"2023","unstructured":"Fu C, Pan X, Liang X, Yu S, Xu X, Min Y (2023) Feature drift in fake news detection: an interpretable analysis. 
Appl Sci 13(1):592","journal-title":"Appl Sci"},{"key":"534_CR3","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1109\/IJCNN48605.2020.9207498","volume-title":"2020 International Joint Conference on Neural Networks (IJCNN)","author":"P Ksieniewicz","year":"2020","unstructured":"Ksieniewicz P, Zyblewski P, Chora\u015b M, Kozik R, Gie\u0142czyk A, Wo\u017aniak M (2020) Fake news detection from data streams. In: 2020 International Joint Conference on Neural Networks (IJCNN), pp\u00a01\u20138. https:\/\/doi.org\/10.1109\/IJCNN48605.2020.9207498"},{"key":"534_CR4","doi-asserted-by":"publisher","first-page":"121","DOI":"10.5753\/kdmile.2021.17469","volume-title":"Anais do IX symposium on knowledge discovery, mining and learning","author":"R Silva","year":"2021","unstructured":"Silva R, Almeida T (2021) How concept drift can impair the classification of fake news. In: Anais do IX symposium on knowledge discovery, mining and learning. SBC, Porto Alegre, pp\u00a0121\u2013128. https:\/\/doi.org\/10.5753\/kdmile.2021.17469. https:\/\/sol.sbc.org.br\/index.php\/kdmile\/article\/view\/17469"},{"key":"534_CR5","first-page":"3391","volume-title":"Proceedings of the 27th international conference on computational linguistics","author":"V P\u00e9rez-Rosas","year":"2018","unstructured":"P\u00e9rez-Rosas V, Kleinberg B, Lefevre A, Mihalcea R (2018) Automatic detection of fake news. In: Bender EM, Derczynski L, Isabelle P (eds) Proceedings of the 27th international conference on computational linguistics. Association for Computational Linguistics, Santa Fe, pp\u00a03391\u20133401. https:\/\/aclanthology.org\/C18-1287"},{"key":"534_CR6","doi-asserted-by":"publisher","first-page":"1230","DOI":"10.1109\/ICCMC51019.2021.9418411","volume-title":"2021 5th International Conference on Computing Methodologies and Communication (ICCMC)","author":"P Goel","year":"2021","unstructured":"Goel P, Singhal S, Aggarwal S, Jain M (2021) Multi domain fake news analysis using transfer learning. 
In: 2021 5th International Conference on Computing Methodologies and Communication (ICCMC), pp\u00a01230\u20131237. https:\/\/doi.org\/10.1109\/ICCMC51019.2021.9418411"},{"key":"534_CR7","series-title":"Proceedings, part III","doi-asserted-by":"publisher","first-page":"650","DOI":"10.1007\/978-3-030-67664-3_39","volume-title":"Machine learning and knowledge discovery in databases: European conference, ECML PKDD 2020","author":"K Shu","year":"2020","unstructured":"Shu K, Zheng G, Li Y, Mukherjee S, Awadallah AH, Ruston S, Liu H (2020) Early detection of fake news with multi-source weak social supervision. In: Machine learning and knowledge discovery in databases: European conference, ECML PKDD 2020, Ghent, Belgium, September 14\u201318, 2020. Proceedings, part III. Springer, Berlin, pp\u00a0650\u2013666. https:\/\/doi.org\/10.1007\/978-3-030-67664-3_39"},{"key":"534_CR8","doi-asserted-by":"publisher","first-page":"274","DOI":"10.1109\/ASONAM.2018.8508520","volume-title":"2018 IEEE\/ACM international conference on Advances in Social Networks Analysis and Mining (ASONAM)","author":"S Helmstetter","year":"2018","unstructured":"Helmstetter S, Paulheim H (2018) Weakly supervised learning for fake news detection on Twitter. In: 2018 IEEE\/ACM international conference on Advances in Social Networks Analysis and Mining (ASONAM). IEEE, Los Alamitos, pp\u00a0274\u2013277"},{"key":"534_CR9","doi-asserted-by":"publisher","first-page":"516","DOI":"10.1609\/aaai.v34i01.5389","volume-title":"Proceedings of the AAAI conference on artificial intelligence 34(01)","author":"Y Wang","year":"2020","unstructured":"Wang Y, Yang W, Ma F, Xu J, Zhong B, Deng Q, Gao J (2020) Weak supervision for fake news detection via reinforcement learning. In: Proceedings of the AAAI conference on artificial intelligence 34(01), pp\u00a0516\u2013523. https:\/\/doi.org\/10.1609\/aaai.v34i01.5389"},{"key":"534_CR10","unstructured":"W3C-CWCG (2024) W3C Credible Web Community Group. 
https:\/\/github.com\/w3c\/credweb. Accessed 2024-02-14"},{"key":"534_CR11","doi-asserted-by":"publisher","first-page":"3","DOI":"10.3233\/SSW55","volume-title":"Towards a knowledge-aware AI: SEMANTiCS 2022\u2014proceedings of the 18th international conference on semantic systems","author":"A Dimou","year":"2022","unstructured":"Dimou A, et al. (2022) Evaluating web content using the w3c credibility signals. In: Towards a knowledge-aware AI: SEMANTiCS 2022\u2014proceedings of the 18th international conference on semantic systems, Vienna, Austria, 13\u201315 September 2022, vol\u00a055. IOS Press, Amsterdam, p\u00a03"},{"key":"534_CR12","unstructured":"Touvron H, Lavril T, Izacard G, Martinet X, Lachaux M-A, Lacroix T, Rozi\u00e8re B, Goyal N, Hambro E, Azhar F, et al (2023) Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971"},{"key":"534_CR13","first-page":"1877","volume-title":"Advances in neural information processing systems","author":"T Brown","year":"2020","unstructured":"Brown T, Mann B, Ryder N, Subbiah M, Kaplan JD, Dhariwal P, Neelakantan A, Shyam P, Sastry G, Askell A, Agarwal S, Herbert-Voss A, Krueger G, Henighan T, Child R, Ramesh A, Ziegler D, Wu J, Winter C, Hesse C, Chen M, Sigler E, Litwin M, Gray S, Chess B, Clark J, Berner C, McCandlish S, Radford A, Sutskever I, Amodei D (2020) Language models are few-shot learners. In: Larochelle H, Ranzato M, Hadsell R, Balcan MF, Lin H (eds) Advances in neural information processing systems, vol\u00a033. Curran Associates, Red Hook, pp\u00a01877\u20131901. 
https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2020\/file\/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf"},{"key":"534_CR14","doi-asserted-by":"publisher","first-page":"2463","DOI":"10.18653\/v1\/D19-1250","volume-title":"Proceedings of the 2019 conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"F Petroni","year":"2019","unstructured":"Petroni F, Rockt\u00e4schel T, Riedel S, Lewis P, Bakhtin A, Wu Y, Miller A (2019) Language models as knowledge bases? In: Proceedings of the 2019 conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Association for Computational Linguistics, Hong Kong, pp\u00a02463\u20132473. https:\/\/doi.org\/10.18653\/v1\/D19-1250. https:\/\/aclanthology.org\/D19-1250"},{"key":"534_CR15","unstructured":"Leite J (2024) PASTEL Repository. https:\/\/github.com\/joaoaleite\/PASTEL. Accessed 2024-02-14"},{"key":"534_CR16","doi-asserted-by":"publisher","first-page":"18","DOI":"10.3115\/v1\/W14-2508","volume-title":"Proceedings of the ACL 2014 workshop on language technologies and computational social science","author":"A Vlachos","year":"2014","unstructured":"Vlachos A, Riedel S (2014) Fact checking: task definition and dataset construction. In: Proceedings of the ACL 2014 workshop on language technologies and computational social science, pp\u00a018\u201322"},{"key":"534_CR17","volume-title":"Proceedings of the 2016 conference of the North American chapter of the association for computational linguistics: human language technologies","author":"W Ferreira","year":"2016","unstructured":"Ferreira W, Vlachos A (2016) Emergent: a novel data-set for stance classification. In: Proceedings of the 2016 conference of the North American chapter of the association for computational linguistics: human language technologies. 
ACL"},{"key":"534_CR18","doi-asserted-by":"publisher","first-page":"422","DOI":"10.18653\/v1\/P17-2067","volume-title":"Proceedings of the 55th annual meeting of the association for computational linguistics (volume 2: short papers)","author":"WY Wang","year":"2017","unstructured":"Wang WY (2017) \u201cLiar, liar pants on fire\u201d: a new benchmark dataset for fake news detection. In: Proceedings of the 55th annual meeting of the association for computational linguistics (volume 2: short papers). Association for Computational Linguistics, Vancouver, pp\u00a0422\u2013426. https:\/\/doi.org\/10.18653\/v1\/P17-2067. https:\/\/aclanthology.org\/P17-2067"},{"key":"534_CR19","doi-asserted-by":"publisher","first-page":"809","DOI":"10.18653\/v1\/N18-1074","volume-title":"Proceedings of the 2018 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 1 (long papers)","author":"J Thorne","year":"2018","unstructured":"Thorne J, Vlachos A, Christodoulopoulos C, Mittal A (2018) FEVER: a large-scale dataset for fact extraction and VERification. In: Walker M, Ji H, Stent A (eds) Proceedings of the 2018 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 1 (long papers). Association for Computational Linguistics, New Orleans, pp\u00a0809\u2013819. https:\/\/doi.org\/10.18653\/v1\/N18-1074. https:\/\/aclanthology.org\/N18-1074"},{"key":"534_CR20","doi-asserted-by":"publisher","first-page":"231","DOI":"10.18653\/v1\/P18-1022","volume-title":"Proceedings of the 56th annual meeting of the association for computational linguistics (volume 1: long papers)","author":"M Potthast","year":"2018","unstructured":"Potthast M, Kiesel J, Reinartz K, Bevendorff J, Stein B (2018) A stylometric inquiry into hyperpartisan and fake news. In: Proceedings of the 56th annual meeting of the association for computational linguistics (volume 1: long papers). 
Association for Computational Linguistics, Melbourne, pp\u00a0231\u2013240. https:\/\/doi.org\/10.18653\/v1\/P18-1022. https:\/\/aclanthology.org\/P18-1022"},{"key":"534_CR21","first-page":"531","volume-title":"Proceedings of the international AAAI conference on web and social media","author":"G Santia","year":"2018","unstructured":"Santia G, Williams J (2018) Buzzface: a news veracity dataset with Facebook user commentary and egos. In: Proceedings of the international AAAI conference on web and social media, vol\u00a012, pp\u00a0531\u2013540"},{"key":"534_CR22","first-page":"1","volume-title":"CEUR workshop proceedings","author":"E Tacchini","year":"2017","unstructured":"Tacchini E, Ballarin G, Della Vedova ML, Moret S, Alfaro L, et al. (2017) Some like it hoax: automated fake news detection in social networks. In: CEUR workshop proceedings, pp\u00a01\u201315. CEUR-WS"},{"issue":"3","key":"534_CR23","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0150989","volume":"11","author":"A Zubiaga","year":"2016","unstructured":"Zubiaga A, Liakata M, Procter R, Wong Sak Hoi G, Tolmie P (2016) Analysing how people orient to and spread rumours in social media by looking at conversational threads. PLoS ONE 11(3):0150989","journal-title":"PLoS ONE"},{"key":"534_CR24","first-page":"258","volume-title":"Proceedings of the international AAAI conference on web and social media","author":"T Mitra","year":"2015","unstructured":"Mitra T, Gilbert E (2015) Credbank: a large-scale social media corpus with associated credibility annotations. In: Proceedings of the international AAAI conference on web and social media, vol\u00a09, pp\u00a0258\u2013267"},{"key":"534_CR25","doi-asserted-by":"publisher","first-page":"849","DOI":"10.1145\/3219819.3219903","volume-title":"Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining. 
KDD \u201918","author":"Y Wang","year":"2018","unstructured":"Wang Y, Ma F, Jin Z, Yuan Y, Xun G, Jha K, Su L, Gao J (2018) Eann: event adversarial neural networks for multi-modal fake news detection. In: Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining. KDD \u201918. Association for Computing Machinery, New York, pp\u00a0849\u2013857. https:\/\/doi.org\/10.1145\/3219819.3219903"},{"key":"534_CR26","first-page":"6149","volume-title":"Proceedings of the 12th language resources and evaluation conference","author":"K Nakamura","year":"2020","unstructured":"Nakamura K, Levy S, Wang WY (2020) Fakeddit: a new multimodal benchmark dataset for fine-grained fake news detection. In: Proceedings of the 12th language resources and evaluation conference. European Language Resources Association, Marseille, pp\u00a06149\u20136157. https:\/\/aclanthology.org\/2020.lrec-1.755"},{"issue":"3","key":"534_CR27","doi-asserted-by":"publisher","first-page":"171","DOI":"10.1089\/big.2020.0062","volume":"8","author":"K Shu","year":"2020","unstructured":"Shu K, Mahudeswaran D, Wang S, Lee D, Liu H (2020) Fakenewsnet: a data repository with news content, social context, and spatiotemporal information for studying fake news on social media. Big Data 8(3):171\u2013188","journal-title":"Big Data"},{"key":"534_CR28","doi-asserted-by":"publisher","first-page":"4325","DOI":"10.1109\/BigData50022.2020.9378472","volume-title":"2020 IEEE International Conference on Big Data (Big Data)","author":"Y Li","year":"2020","unstructured":"Li Y, Jiang B, Shu K, Liu H (2020) Toward a multilingual and multimodal data repository for covid-19 disinformation. In: 2020 IEEE International Conference on Big Data (Big Data). 
IEEE, Los Alamitos, pp\u00a04325\u20134330"},{"key":"534_CR29","first-page":"2862","volume-title":"Proceedings of the 12th language resources and evaluation conference","author":"MZ Hossain","year":"2020","unstructured":"Hossain MZ, Rahman MA, Islam MS, Kar S (2020) BanFakeNews: a dataset for detecting fake news in Bangla. In: Proceedings of the 12th language resources and evaluation conference. European Language Resources Association, Marseille, pp\u00a02862\u20132871. https:\/\/aclanthology.org\/2020.lrec-1.349"},{"key":"534_CR30","first-page":"230","volume-title":"Proceedings of the 16th international conference on natural language processing","author":"T Saikh","year":"2019","unstructured":"Saikh T, De A, Ekbal A, Bhattacharyya P (2019) A deep learning approach for automatic detection of fake news. In: Proceedings of the 16th international conference on natural language processing. NLP Association of India, International Institute of Information Technology, Hyderabad, pp\u00a0230\u2013238. https:\/\/aclanthology.org\/2019.icon-1.27"},{"key":"534_CR31","doi-asserted-by":"publisher","first-page":"174","DOI":"10.1109\/BigMM50055.2020.00033","volume-title":"2020 IEEE sixth international conference on multimedia big data (bigMM)","author":"A Gautam","year":"2020","unstructured":"Gautam A, Jerripothula KR (2020) Sgg: spinbot, grammarly and glove based fake news detection. In: 2020 IEEE sixth international conference on multimedia big data (bigMM). IEEE, Los Alamitos, pp\u00a0174\u2013182"},{"key":"534_CR32","first-page":"81","volume-title":"Proceedings of the AAAI conference on artificial intelligence","author":"Y Dun","year":"2021","unstructured":"Dun Y, Tu K, Chen C, Hou C, Yuan X (2021) Kan: knowledge-aware attention network for fake news detection. 
In: Proceedings of the AAAI conference on artificial intelligence, vol\u00a035, pp\u00a081\u201389"},{"key":"534_CR33","first-page":"3761","volume-title":"Proceedings of the thirteenth language resources and evaluation conference","author":"B Bhattarai","year":"2022","unstructured":"Bhattarai B, Granmo O-C, Jiao L (2022) ConvTextTM: an explainable convolutional Tsetlin machine framework for text classification. In: Proceedings of the thirteenth language resources and evaluation conference. European Language Resources Association, Marseille, pp\u00a03761\u20133770. https:\/\/aclanthology.org\/2022.lrec-1.401"},{"key":"534_CR34","first-page":"98","volume":"3","author":"N Rai","year":"2022","unstructured":"Rai N, Kumar D, Kaushik N, Raj C, Ali A (2022) Fake news classification using transformer based enhanced lstm and bert. Int J Cogn Comput Eng 3:98\u2013105","journal-title":"Int J Cogn Comput Eng"},{"key":"534_CR35","first-page":"759","volume-title":"Proceedings of the international AAAI conference on web and social media","author":"B Horne","year":"2017","unstructured":"Horne B, Adali S (2017) This just in: fake news packs a lot in title, uses simpler, repetitive content in text body, more similar to satire than real news. In: Proceedings of the international AAAI conference on web and social media, vol\u00a011, pp\u00a0759\u2013766"},{"key":"534_CR36","doi-asserted-by":"publisher","first-page":"461","DOI":"10.1109\/SP.2012.34","volume-title":"2012 IEEE symposium on security and privacy","author":"S Afroz","year":"2012","unstructured":"Afroz S, Brennan M, Greenstadt R (2012) Detecting hoaxes, frauds, and deception in writing style online. In: 2012 IEEE symposium on security and privacy. 
IEEE, Los Alamitos, pp\u00a0461\u2013475"},{"key":"534_CR37","first-page":"2931","volume-title":"Proceedings of the 2017 conference on empirical methods in natural language processing","author":"H Rashkin","year":"2017","unstructured":"Rashkin H, Choi E, Jang JY, Volkova S, Choi Y (2017) Truth of varying shades: analyzing language in fake news and political fact-checking. In: Proceedings of the 2017 conference on empirical methods in natural language processing, pp\u00a02931\u20132937"},{"key":"534_CR38","first-page":"6992","volume-title":"Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)","author":"N Nikolaidis","year":"2024","unstructured":"Nikolaidis N, Piskorski J, Stefanovitch N (2024) Exploring the usability of persuasion techniques for downstream misinformation-related classification tasks. In: Calzolari N, Kan M-Y, Hoste V, Lenci A, Sakti S, Xue N (eds) Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). ELRA and ICCL, Torino, pp\u00a06992\u20137006. https:\/\/aclanthology.org\/2024.lrec-main.613"},{"key":"534_CR39","volume-title":"Workshop on \u201cAI for social good\u201d, NIPS 2018","author":"N O\u2019Brien","year":"2018","unstructured":"O\u2019Brien N, Latessa S, Evangelopoulos G, Boix X (2018) The language of fake news: opening the black-box of deep learning based detectors. In: Workshop on \u201cAI for social good\u201d, NIPS 2018, Montreal, Canada. http:\/\/hdl.handle.net\/1721.1\/120056"},{"key":"534_CR40","doi-asserted-by":"publisher","first-page":"877","DOI":"10.1145\/3331184.3331285","volume-title":"Proceedings of the 42nd international ACM SIGIR conference on research and development in information retrieval. SIGIR\u201919","author":"A Giachanou","year":"2019","unstructured":"Giachanou A, Rosso P, Crestani F (2019) Leveraging emotional signals for credibility detection. 
In: Proceedings of the 42nd international ACM SIGIR conference on research and development in information retrieval. SIGIR\u201919. Association for Computing Machinery, New York, pp\u00a0877\u2013880. https:\/\/doi.org\/10.1145\/3331184.3331285"},{"key":"534_CR41","first-page":"79","volume-title":"Proceedings of the LREC 2022 workshop on natural language processing for political sciences","author":"E Dufraisse","year":"2022","unstructured":"Dufraisse E, Treuillier C, Brun A, Tourille J, Castagnos S, Popescu A (2022) Don\u2019t burst blindly: for a better use of natural language processing to fight opinion bubbles in news recommendations. In: Proceedings of the LREC 2022 workshop on natural language processing for political sciences. European Language Resources Association, Marseille, pp\u00a079\u201385. https:\/\/aclanthology.org\/2022.politicalnlp-1.11"},{"issue":"3","key":"534_CR42","doi-asserted-by":"publisher","first-page":"349","DOI":"10.1177\/09579265221076609","volume":"33","author":"E Musi","year":"2022","unstructured":"Musi E, Reed C (2022) From fallacies to semi-fake news: improving the identification of misinformation triggers across digital media. Discourse Soc 33(3):349\u2013370. https:\/\/doi.org\/10.1177\/09579265221076609","journal-title":"Discourse Soc"},{"key":"534_CR43","doi-asserted-by":"publisher","first-page":"163","DOI":"10.1007\/978-3-030-42699-6_9","volume-title":"Disinformation, misinformation, and fake news in social media: emerging research challenges and opportunities","author":"N Sitaula","year":"2020","unstructured":"Sitaula N, Mohan CK, Grygiel J, Zhou X, Zafarani R (2020) Credibility-based fake news detection. 
In: Disinformation, misinformation, and fake news in social media: emerging research challenges and opportunities, pp\u00a0163\u2013182"},{"key":"534_CR44","series-title":"International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE","doi-asserted-by":"publisher","first-page":"603","DOI":"10.1145\/3184558.3188731","volume-title":"Companion proceedings of the web conference 2018. WWW \u201918","author":"AX Zhang","year":"2018","unstructured":"Zhang AX, Ranganathan A, Metz SE, Appling S, Sehat CM, Gilmore N, Adams NB, Vincent E, Lee J, Robbins M, Bice E, Hawke S, Karger D, Mina AX (2018) A structured response to misinformation: defining and annotating credibility indicators in news articles. In: Companion proceedings of the web conference 2018. WWW \u201918. International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE, pp\u00a0603\u2013612. https:\/\/doi.org\/10.1145\/3184558.3188731"},{"key":"534_CR45","first-page":"3280","volume-title":"International conference on machine learning","author":"D Fu","year":"2020","unstructured":"Fu D, Chen M, Sala F, Hooper S, Fatahalian K, R\u00e9 C (2020) Fast and three-rious: speeding up weak supervision with triplet methods. In: International conference on machine learning, pp\u00a03280\u20133291. PMLR"},{"key":"534_CR46","unstructured":"Varma P, Sala F, Sagawa S, Fries J, Fu D, Khattar S, Ramamoorthy A, Xiao K, Fatahalian K, Priest J, et al (2019) Multi-resolution weak supervision for sequential data. Adv Neural Inf Process Syst 32"},{"key":"534_CR47","unstructured":"Ratner AJ, De Sa CM, Wu S, Selsam D, R\u00e9 C (2016) Data programming: Creating large training sets, quickly. Adv Neural Inf Process Syst 29"},{"key":"534_CR48","unstructured":"Smith R, Fries JA, Hancock B, Bach SH (2022) Language models in the loop: Incorporating prompting into weak supervision. 
arXiv preprint arXiv:2205.02318"},{"issue":"6","key":"534_CR49","volume":"3","author":"R Taori","year":"2023","unstructured":"Taori R, Gulrajani I, Zhang T, Dubois Y, Li X, Guestrin C, Liang P, Hashimoto TB (2023) Alpaca: a strong, replicable instruction-following model. Stanf Cent Res Found Model 3(6):7. https:\/\/crfm.stanford.edu\/2023\/03\/13\/alpaca.html","journal-title":"Stanf Cent Res Found Model"},{"key":"534_CR50","first-page":"269","volume-title":"Proceedings of the VLDB endowment. International conference on very large data bases","author":"A Ratner","year":"2017","unstructured":"Ratner A, Bach SH, Ehrenberg H, Fries J, Wu S, R\u00e9 C (2017) Snorkel: rapid training data creation with weak supervision. In: Proceedings of the VLDB endowment. International conference on very large data bases, vol\u00a011. NIH Public Access, p\u00a0269"},{"issue":"1","key":"534_CR51","doi-asserted-by":"publisher","first-page":"22","DOI":"10.1145\/3137597.3137600","volume":"19","author":"K Shu","year":"2017","unstructured":"Shu K, Sliva A, Wang S, Tang J, Liu H (2017) Fake news detection on social media: a data mining perspective. ACM SIGKDD Explor Newsl 19(1):22\u201336","journal-title":"ACM SIGKDD Explor Newsl"},{"key":"534_CR52","doi-asserted-by":"publisher","first-page":"1230","DOI":"10.1109\/ICCMC51019.2021.9418411","volume-title":"2021 5th International Conference on Computing Methodologies and Communication (ICCMC)","author":"P Goel","year":"2021","unstructured":"Goel P, Singhal S, Aggarwal S, Jain M (2021) Multi domain fake news analysis using transfer learning. In: 2021 5th International Conference on Computing Methodologies and Communication (ICCMC), pp\u00a01230\u20131237. https:\/\/doi.org\/10.1109\/ICCMC51019.2021.9418411"},{"key":"534_CR53","unstructured":"Lee AN, Hunter CJ, Ruiz N (2023) Platypus: Quick, cheap, and powerful refinement of llms. 
arXiv preprint arXiv:2308.07317"},{"key":"534_CR54","unstructured":"Clark P, Cowhey I, Etzioni O, Khot T, Sabharwal A, Schoenick C, Tafjord O (2018) Think you have solved question answering? Try arc, the ai2 reasoning challenge. arXiv preprint arXiv:1803.05457"},{"key":"534_CR55","doi-asserted-by":"publisher","first-page":"4791","DOI":"10.18653\/v1\/P19-1472","volume-title":"Proceedings of the 57th annual meeting of the association for computational linguistics","author":"R Zellers","year":"2019","unstructured":"Zellers R, Holtzman A, Bisk Y, Farhadi A, Choi Y (2019) HellaSwag: can a machine really finish your sentence? In: Proceedings of the 57th annual meeting of the association for computational linguistics. Association for Computational Linguistics, Florence, pp\u00a04791\u20134800. https:\/\/doi.org\/10.18653\/v1\/P19-1472. https:\/\/aclanthology.org\/P19-1472"},{"key":"534_CR56","volume-title":"Proceedings of the International Conference on Learning Representations (ICLR)","author":"D Hendrycks","year":"2021","unstructured":"Hendrycks D, Burns C, Basart S, Zou A, Mazeika M, Song D, Steinhardt J (2021) Measuring massive multitask language understanding. In: Proceedings of the International Conference on Learning Representations (ICLR)"},{"key":"534_CR57","doi-asserted-by":"publisher","first-page":"3214","DOI":"10.18653\/v1\/2022.acl-long.229","volume-title":"Proceedings of the 60th annual meeting of the association for computational linguistics (volume 1: long papers)","author":"S Lin","year":"2022","unstructured":"Lin S, Hilton J, Evans O (2022) TruthfulQA: measuring how models mimic human falsehoods. In: Proceedings of the 60th annual meeting of the association for computational linguistics (volume 1: long papers). Association for Computational Linguistics, Dublin, pp\u00a03214\u20133252. https:\/\/doi.org\/10.18653\/v1\/2022.acl-long.229. 
https:\/\/aclanthology.org\/2022.acl-long.229"},{"key":"534_CR58","first-page":"24824","volume-title":"Advances in neural information processing systems","author":"J Wei","year":"2022","unstructured":"Wei J, Wang X, Schuurmans D, Bosma M, Ichter B, Xia F, Chi E, Le QV, Zhou D (2022) Chain-of-thought prompting elicits reasoning in large language models. In: Koyejo S, Mohamed S, Agarwal A, Belgrave D, Cho K, Oh A (eds) Advances in neural information processing systems, vol\u00a035. Curran Associates, Red Hook, pp\u00a024824\u201324837. https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2022\/file\/9d5609613524ecf4f15af0f7b31abca4-Paper-Conference.pdf"},{"key":"534_CR59","volume-title":"International conference on learning representations","author":"EJ Hu","year":"2022","unstructured":"Hu EJ, Shen Y, Wallis P, Allen-Zhu Z, Li Y, Wang S, Wang L, Chen W (2022) LoRA: low-rank adaptation of large language models. In: International conference on learning representations. https:\/\/openreview.net\/forum?id=nZeVKeeFYf9"},{"key":"534_CR60","unstructured":"HuggingFace Trainer Documentation. https:\/\/huggingface.co\/docs\/transformers\/main_classes\/trainer. Accessed 2024-02-14"},{"key":"534_CR61","unstructured":"Dettmers T, Zettlemoyer L (2023) The case for 4-bit precision: k-bit inference scaling laws. In: Krause A, Brunskill E, Cho K, Engelhardt B, Sabato S, Scarlett J (eds) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol\u00a0202. PMLR, pp\u00a07750\u20137774. 
https:\/\/proceedings.mlr.press\/v202\/dettmers23a.html"},{"key":"534_CR62","first-page":"30016","volume-title":"Advances in neural information processing systems","author":"J Hoffmann","year":"2022","unstructured":"Hoffmann J, Borgeaud S, Mensch A, Buchatskaya E, Cai T, Rutherford E, Las Casas D, Hendricks LA, Welbl J, Clark A, Hennigan T, Noland E, Millican K, Driessche G, Damoc B, Guy A, Osindero S, Simonyan K, Elsen E, Vinyals O, Rae J, Sifre L (2022) An empirical analysis of compute-optimal large language model training. In: Koyejo S, Mohamed S, Agarwal A, Belgrave D, Cho K, Oh A (eds) Advances in neural information processing systems, vol\u00a035. Curran Associates, Red Hook, pp\u00a030016\u201330030. https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2022\/file\/c1e2faff6f588870935f114ebe04a3e5-Paper-Conference.pdf"},{"key":"534_CR63","doi-asserted-by":"crossref","unstructured":"Hu B, Sheng Q, Cao J, Shi Y, Li Y, Wang D, Qi P (2023) Bad actor, good advisor: Exploring the role of large language models in fake news detection. arXiv preprint arXiv:2309.12247","DOI":"10.1609\/aaai.v38i20.30214"},{"key":"534_CR64","doi-asserted-by":"publisher","first-page":"151","DOI":"10.1007\/s10994-009-5152-4","volume":"79","author":"S Ben-David","year":"2010","unstructured":"Ben-David S, Blitzer J, Crammer K, Kulesza A, Pereira F, Vaughan JW (2010) A theory of learning from different domains. Mach Learn 79:151\u2013175","journal-title":"Mach Learn"},{"issue":"2","key":"534_CR65","doi-asserted-by":"publisher","DOI":"10.2196\/19273","volume":"6","author":"E Chen","year":"2020","unstructured":"Chen E, Lerman K, Ferrara E (2020) Tracking social media discourse about the covid-19 pandemic: development of a public coronavirus Twitter data set. JMIR Public Health Surveill 6(2):19273. 
https:\/\/doi.org\/10.2196\/19273","journal-title":"JMIR Public Health Surveill"},{"key":"534_CR66","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.nlpcovid19-2.11","volume-title":"Proceedings of the 1st workshop on NLP for COVID-19 (part 2) at EMNLP 2020","author":"T Hossain","year":"2020","unstructured":"Hossain T, Logan IV RL, Ugarte A, Matsubara Y, Young S, Singh S (2020) COVIDLies: detecting COVID-19 misinformation on social media. In: Proceedings of the 1st workshop on NLP for COVID-19 (part 2) at EMNLP 2020. Association for Computational Linguistics, Online. https:\/\/aclanthology.org\/2020.nlpcovid19-2.11. https:\/\/doi.org\/10.18653\/v1\/2020.nlpcovid19-2.11"},{"key":"534_CR67","unstructured":"Cui L, Lee D (2020) Coaid: Covid-19 healthcare misinformation dataset. arXiv preprint arXiv:2006.00885"},{"key":"534_CR68","doi-asserted-by":"publisher","unstructured":"Liu P, Yuan W, Fu J, Jiang Z, Hayashi H, Neubig G (2023) Pre-train, prompt, and predict: a systematic survey of prompting methods in natural language processing. ACM Comput Surv 55(9). https:\/\/doi.org\/10.1145\/3560815","DOI":"10.1145\/3560815"},{"key":"534_CR69","doi-asserted-by":"publisher","unstructured":"Nadeem M, Bethke A, Reddy S (2021) StereoSet: Measuring stereotypical bias in pretrained language models. In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Association for Computational Linguistics, Online, pp\u00a05356\u20135371. https:\/\/doi.org\/10.18653\/v1\/2021.acl-long.416. https:\/\/aclanthology.org\/2021.acl-long.416","DOI":"10.18653\/v1\/2021.acl-long.416"},{"key":"534_CR70","doi-asserted-by":"publisher","first-page":"9291","DOI":"10.1145\/3581783.3612704","volume-title":"Proceedings of the 31st ACM international conference on multimedia. 
MM \u201923","author":"D Xu","year":"2023","unstructured":"Xu D, Fan S, Kankanhalli M (2023) Combating misinformation in the era of generative ai models. In: Proceedings of the 31st ACM international conference on multimedia. MM \u201923. Association for Computing Machinery, New York, pp\u00a09291\u20139298. https:\/\/doi.org\/10.1145\/3581783.3612704"}],"container-title":["EPJ Data Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1140\/epjds\/s13688-025-00534-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1140\/epjds\/s13688-025-00534-0\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1140\/epjds\/s13688-025-00534-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T10:55:21Z","timestamp":1740135321000},"score":1,"resource":{"primary":{"URL":"https:\/\/epjdatascience.springeropen.com\/articles\/10.1140\/epjds\/s13688-025-00534-0"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,2,21]]},"references-count":70,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["534"],"URL":"https:\/\/doi.org\/10.1140\/epjds\/s13688-025-00534-0","relation":{"has-preprint":[{"id-type":"doi","id":"10.21203\/rs.3.rs-5389911\/v1","asserted-by":"object"}]},"ISSN":["2193-1127"],"issn-type":[{"value":"2193-1127","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,2,21]]},"assertion":[{"value":"4 November 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"14 February 
2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"21 February 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"LLMs are known to inherit biases from their training data [\n                      \n                      ], which can manifest in their interpretations and judgements regarding the presence or absence of credibility signals in textual content. These biases may lead to inaccuracies or disparities in signal detection, potentially favouring certain types of content or perspectives over others. Moreover, the deployment of LLM-based systems in real-world applications must navigate concerns around fairness, transparency, and accountability. Researchers and developers are therefore urged to mitigate biases through rigorous testing, data preprocessing, and continuous monitoring. Also, although efforts aimed at mitigating misinformation are crucial in combating its harmful effects, it is important to acknowledge that these efforts can inadvertently empower malicious actors [\n                      \n                      ]. By gaining insights into which credibility signals are more easily detected by LLMs, and which correlate more strongly with veracity, malicious users could potentially exploit this knowledge to enhance their misinformation tactics and circumvent automatic detection systems. 
Therefore, we strongly urge researchers to apply our methodology with caution and in accordance with best practice ethics protocols.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethical considerations"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"The authors provide their full consent for the publication of this manuscript in EPJ Data Science.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare no competing interests.","order":5,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"16"}}