{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,14]],"date-time":"2026-04-14T13:38:05Z","timestamp":1776173885414,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":31,"publisher":"ACM","license":[{"start":{"date-parts":[[2024,6,18]],"date-time":"2024-06-18T00:00:00Z","timestamp":1718668800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Killam Postdoctoral Fellowship"},{"name":"Strategic Research Council of Research Council of Finland","award":["358471"],"award-info":[{"award-number":["358471"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2024,6,18]]},"DOI":"10.1145\/3661167.3661172","type":"proceedings-article","created":{"date-parts":[[2024,6,14]],"date-time":"2024-06-14T12:24:25Z","timestamp":1718367865000},"page":"262-271","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":16,"title":["The Promise and Challenges of Using LLMs to Accelerate the Screening Process of Systematic Reviews"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5220-8730","authenticated-orcid":false,"given":"Aleksi","family":"Huotala","sequence":"first","affiliation":[{"name":"Department of Computer Science, University of Helsinki, Finland"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3695-7280","authenticated-orcid":false,"given":"Miikka","family":"Kuutila","sequence":"additional","affiliation":[{"name":"Faculty of Computer Science, Dalhousie University, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7411-0857","authenticated-orcid":false,"given":"Paul","family":"Ralph","sequence":"additional","affiliation":[{"name":"Faculty of Computer Science, Dalhousie University, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2841-5879","authenticated-orcid":false,"given":"Mika","family":"M\u00e4ntyl\u00e4","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Helsinki, Finland and M3S, University of Oulu, Finland"}]}],"member":"320","published-online":{"date-parts":[[2024,6,18]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/3442695"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.21105\/joss.00774"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","unstructured":"Francisco Bolanos Angelo Salatino Francesco Osborne and Enrico Motta. 2024. Artificial Intelligence for Literature Reviews: Opportunities and Challenges. https:\/\/doi.org\/10.48550\/arXiv.2402.08565 arXiv:2402.08565\u00a0[cs]","DOI":"10.48550\/arXiv.2402.08565"},{"key":"e_1_3_2_1_4_1","volume-title":"Language models are few-shot learners. Advances in neural information processing systems 33","author":"Brown Tom","year":"2020","unstructured":"Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared\u00a0D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, 2020. Language models are few-shot learners. Advances in neural information processing systems 33 (2020), 1877\u20131901."},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1186\/s13643-021-01635-3"},{"key":"e_1_3_2_1_6_1","volume-title":"What\u2019s so Simple about Simplified Texts? A Computational and Psycholinguistic Investigation of Text Comprehension and Text Processing.Reading in a Foreign Language 26, 1","author":"Crossley A","year":"2014","unstructured":"Scott\u00a0A Crossley, Hae\u00a0Sung Yang, and Danielle\u00a0S McNamara. 2014. What\u2019s so Simple about Simplified Texts? A Computational and Psycholinguistic Investigation of Text Comprehension and Text Processing.Reading in a Foreign Language 26, 1 (2014), 92\u2013113."},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.infsof.2016.09.002"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1186\/s13643-018-0707-8"},{"key":"e_1_3_2_1_9_1","volume-title":"Developing a test of scientific literacy skills (TOSLS): Measuring undergraduates","author":"Gormally Cara","year":"2012","unstructured":"Cara Gormally, Peggy Brickman, and Mary Lutz. 2012. Developing a test of scientific literacy skills (TOSLS): Measuring undergraduates\u2019 evaluation of scientific information and arguments. CBE\u2014Life Sciences Education 11, 4 (2012), 364\u2013377."},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.2196\/48996"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","unstructured":"Xinyi Hou Yanjie Zhao Yue Liu Zhou Yang Kailong Wang Li Li Xiapu Luo David Lo John Grundy and Haoyu Wang. 2023. Large Language Models for Software Engineering: A Systematic Literature Review. https:\/\/doi.org\/10.48550\/arXiv.2308.10620 arxiv:2308.10620\u00a0[cs]","DOI":"10.48550\/arXiv.2308.10620"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","unstructured":"Aleksi Huotala Miikka Kuutila Paul Ralph and Mika M\u00e4ntyl\u00e4. 2024. Dataset for paper: The Promise and Challenges of Using LLMs to Accelerate the Screening Process of Systematic Reviews. https:\/\/doi.org\/10.5281\/zenodo.11028876","DOI":"10.5281\/zenodo.11028876"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","unstructured":"Qusai Khraisha Sophie Put Johanna Kappenberg Azza Warraitch and Kristin Hadfield. 2024. Can Large Language Models Replace Humans in Systematic Reviews? Evaluating GPT-4\u2019s Efficacy in Screening and Extracting Data from Peer-Reviewed and Grey Literature in Multiple Languages. Research Synthesis Methods (March 2024). https:\/\/doi.org\/10.1002\/jrsm.1715","DOI":"10.1002\/jrsm.1715"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"crossref","unstructured":"J\u00a0Peter Kincaid Robert\u00a0P Fishburne\u00a0Jr Richard\u00a0L Rogers and Brad\u00a0S Chissom. 1975. Derivation of new readability formulas (automated readability index fog count and flesch reading ease formula) for navy enlisted personnel. Technical Report.","DOI":"10.21236\/ADA006655"},{"key":"e_1_3_2_1_15_1","unstructured":"Barbara Kitchenham and Stuart Charters. 2007. Guidelines for performing systematic literature reviews in software engineering. Technical Report EBSE-2007-01 (2007)."},{"key":"e_1_3_2_1_16_1","volume-title":"Advances in Neural Information Processing Systems, S.\u00a0Koyejo, S.\u00a0Mohamed, A.\u00a0Agarwal, D.\u00a0Belgrave, K.\u00a0Cho, and A.\u00a0Oh (Eds.). Vol.\u00a035. Curran Associates","author":"Kojima Takeshi","year":"2022","unstructured":"Takeshi Kojima, Shixiang\u00a0(Shane) Gu, Machel Reid, Yutaka Matsuo, and Yusuke Iwasawa. 2022. Large Language Models are Zero-Shot Reasoners. In Advances in Neural Information Processing Systems, S.\u00a0Koyejo, S.\u00a0Mohamed, A.\u00a0Agarwal, D.\u00a0Belgrave, K.\u00a0Cho, and A.\u00a0Oh (Eds.). Vol.\u00a035. Curran Associates, Inc., 22199\u201322213. https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2022\/file\/8bb0d291acd4acf06ef112099c16f326-Paper-Conference.pdf"},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.infsof.2020.106257"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1186\/s13643-019-1074-9"},{"key":"e_1_3_2_1_19_1","first-page":"639","article-title":"SMOG grading-a new readability formula","volume":"12","author":"Mc\u00a0Laughlin G\u00a0Harry","year":"1969","unstructured":"G\u00a0Harry Mc\u00a0Laughlin. 1969. SMOG grading-a new readability formula. Journal of reading 12, 8 (1969), 639\u2013646.","journal-title":"Journal of reading"},{"key":"e_1_3_2_1_20_1","volume-title":"Using text mining for study identification in systematic reviews: a systematic review of current approaches. Systematic reviews 4, 1","author":"O\u2019Mara-Eves Alison","year":"2015","unstructured":"Alison O\u2019Mara-Eves, James Thomas, John McNaught, Makoto Miwa, and Sophia Ananiadou. 2015. Using text mining for study identification in systematic reviews: a systematic review of current approaches. Systematic reviews 4, 1 (2015), 1\u201322."},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3540250.3560877"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1186\/s13643-015-0067-6"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","unstructured":"Ambrose Robinson William Thorne Ben\u00a0P. Wu Abdullah Pandor Munira Essat Mark Stevenson and Xingyi Song. 2023. Bio-SIEVE: Exploring Instruction Tuning Large Language Models for Systematic Review Automation. https:\/\/doi.org\/10.48550\/arXiv.2308.06610 arxiv:2308.06610\u00a0[cs]","DOI":"10.48550\/arXiv.2308.06610"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.5555\/2818754.2818836"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.findings-acl.233"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.infsof.2021.106589"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","unstructured":"Shuai Wang Harrisen Scells Shengyao Zhuang Martin Potthast Bevan Koopman and Guido Zuccon. 2024. Zero-Shot Generative Large Language Models for Systematic Review Screening Automation. https:\/\/doi.org\/10.48550\/arXiv.2401.06320 arxiv:2401.06320\u00a0[cs]","DOI":"10.48550\/arXiv.2401.06320"},{"key":"e_1_3_2_1_28_1","volume-title":"Chi, Quoc Le, and Denny Zhou","author":"Wei Jason","year":"2023","unstructured":"Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc Le, and Denny Zhou. 2023. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. arxiv:2201.11903\u00a0[cs]"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","unstructured":"David Wilkins. 2023. Automated Title and Abstract Screening for Scoping Reviews Using the GPT-4 Large Language Model. https:\/\/doi.org\/10.48550\/arxiv.2311.07918 arxiv:2311.07918\u00a0[cs]","DOI":"10.48550\/arxiv.2311.07918"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.infsof.2022.106908"},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","unstructured":"Wayne\u00a0Xin Zhao Kun Zhou Junyi Li Tianyi Tang Xiaolei Wang Yupeng Hou Yingqian Min Beichen Zhang Junjie Zhang Zican Dong Yifan Du Chen Yang Yushuo Chen Zhipeng Chen Jinhao Jiang Ruiyang Ren Yifan Li Xinyu Tang Zikang Liu Peiyu Liu Jian-Yun Nie and Ji-Rong Wen. 2023. A Survey of Large Language Models. https:\/\/doi.org\/10.48550\/arXiv.2303.18223 arxiv:2303.18223\u00a0[cs]","DOI":"10.48550\/arXiv.2303.18223"}],"event":{"name":"EASE 2024: 28th International Conference on Evaluation and Assessment in Software Engineering","location":"Salerno Italy","acronym":"EASE 2024"},"container-title":["Proceedings of the 28th International Conference on Evaluation and Assessment in Software Engineering"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3661167.3661172","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3661167.3661172","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,8,22]],"date-time":"2025-08-22T11:15:00Z","timestamp":1755861300000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3661167.3661172"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,6,18]]},"references-count":31,"alternative-id":["10.1145\/3661167.3661172","10.1145\/3661167"],"URL":"https:\/\/doi.org\/10.1145\/3661167.3661172","relation":{},"subject":[],"published":{"date-parts":[[2024,6,18]]},"assertion":[{"value":"2024-06-18","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}