{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,6]],"date-time":"2026-02-06T01:11:05Z","timestamp":1770340265239,"version":"3.49.0"},"reference-count":45,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2025,11,18]],"date-time":"2025-11-18T00:00:00Z","timestamp":1763424000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Computers"],"abstract":"<jats:p>Modern web applications change frequently in response to user and market needs, making their testing challenging. Manual testing and automation methods often struggle to keep up with these changes. We propose an automated testing framework, AutoQALLMs, that utilises various LLMs (Large Language Models), including GPT-4, Claude, and Grok, alongside Selenium WebDriver, BeautifulSoup, and regular expressions. This framework enables one-click testing, where users provide a URL as input and receive test results as output, thus eliminating the need for human intervention. It extracts HTML (Hypertext Markup Language) elements from the webpage and utilises the LLMs API to generate Selenium-based test scripts. Regular expressions enhance the clarity and maintainability of these scripts. The scripts are executed automatically, and the results, such as pass\/fail status and error details, are displayed to the tester. This streamlined input\u2013output process forms the core foundation of the AutoQALLMs framework. We evaluated the framework on 30 websites. The results show that the system drastically reduces the time needed to create test cases, achieves broad test coverage (96%) with Claude 4.5 LLM, which is competitive with manual scripts (98%), and allows for rapid regeneration of tests in response to changes in webpage structure. Software testing expert feedback confirmed that the proposed AutoQALLMs method for automated web application testing enables faster regression testing, reduces manual effort, and maintains reliable test execution. However, some limitations remain in handling complex page changes and validation. Although Claude 4.5 achieved slightly higher test coverage in the comparative evaluation of the proposed experiment, GPT-4 was selected as the default model for AutoQALLMs due to its cost-efficiency, reproducibility, and stable script generation across diverse websites. Future improvements may focus on increasing accuracy, adding self-healing techniques, and expanding to more complex testing scenarios.<\/jats:p>","DOI":"10.3390\/computers14110501","type":"journal-article","created":{"date-parts":[[2025,11,19]],"date-time":"2025-11-19T08:50:07Z","timestamp":1763542207000},"page":"501","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["AutoQALLMs: Automating Web Application Testing Using Large Language Models (LLMs) and Selenium"],"prefix":"10.3390","volume":"14","author":[{"given":"Sindhupriya","family":"Mallipeddi","sequence":"first","affiliation":[{"name":"Cybersecurity and Computing Systems Research Group, Department of Computer Science, University of Hertfordshire, Hatfield AL10 9AB, UK"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9328-2593","authenticated-orcid":false,"given":"Muhammad","family":"Yaqoob","sequence":"additional","affiliation":[{"name":"Cybersecurity and Computing Systems Research Group, Department of Computer Science, University of Hertfordshire, Hatfield AL10 9AB, UK"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3306-1195","authenticated-orcid":false,"given":"Javed Ali","family":"Khan","sequence":"additional","affiliation":[{"name":"Cybersecurity and Computing Systems Research Group, Department of Computer Science, University of Hertfordshire, Hatfield AL10 9AB, UK"}]},{"ORCID":"https:\/\/orcid.org\/0009-0003-0624-628X","authenticated-orcid":false,"given":"Tahir","family":"Mehmood","sequence":"additional","affiliation":[{"name":"School of Information Technology, UNITAR International University, Petaling Jaya 47301, Selangor, Malaysia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8819-5831","authenticated-orcid":false,"given":"Alexios","family":"Mylonas","sequence":"additional","affiliation":[{"name":"Cybersecurity and Computing Systems Research Group, Department of Computer Science, University of Hertfordshire, Hatfield AL10 9AB, UK"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3392-9970","authenticated-orcid":false,"given":"Nikolaos","family":"Pitropakis","sequence":"additional","affiliation":[{"name":"Department of Information Technology, Cybersecurity and Computer Science, The American College of Greece, 15342 Athens, Greece"}]}],"member":"1968","published-online":{"date-parts":[[2025,11,18]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1189","DOI":"10.1109\/TR.2019.2892517","article-title":"Machine learning applied to software testing: A systematic mapping study","volume":"68","author":"Durelli","year":"2019","journal-title":"IEEE Trans. Reliab."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"174","DOI":"10.1016\/j.jss.2014.01.010","article-title":"Web application testing: A systematic literature review","volume":"91","author":"Garousi","year":"2014","journal-title":"J. Syst. Softw."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Nguyen, D.P., and Maag, S. (2020, January 7\u20139). Codeless web testing using Selenium and machine learning. Proceedings of the ICSOFT 2020: 15th International Conference on Software Technologies, Online Event.","DOI":"10.5220\/0009885400510060"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Paul, N., and Tommy, R. (2018, January 11\u201312). An Approach of Automated Testing on Web Based Platform Using Machine Learning and Selenium. Proceedings of the 2018 International Conference on Inventive Research in Computing Applications (ICIRCA), Coimbatore, India.","DOI":"10.1109\/ICIRCA.2018.8597297"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Briand, L.C. (2008, January 12\u201313). Novel applications of machine learning in software testing. Proceedings of the 2008 The Eighth International Conference on Quality Software, Oxford, UK.","DOI":"10.1109\/QSIC.2008.29"},{"key":"ref_6","unstructured":"Khaliq, Z., Farooq, S.U., and Khan, D.A. (2022). Artificial intelligence in software testing: Impact, problems, challenges and prospect. arXiv."},{"key":"ref_7","first-page":"5","article-title":"Artificial AI in Test Automation: Software Testing opportunities with Openai Technology-Chatgpt","volume":"62","author":"Talasbek","year":"2023","journal-title":"Suleyman Demirel Univ. Bull. Nat. Tech. Sci."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Chen, F.K., Liu, C.H., and You, S.D. (2025). Using Large Language Model to Fill in Web Forms to Support Automated Web Application Testing. Information, 16.","DOI":"10.3390\/info16020102"},{"key":"ref_9","unstructured":"Li, T., Huang, R., Cui, C., Towey, D., Ma, L., Li, Y.F., and Xia, W. (2024). A Survey on Web Application Testing: A Decade of Evolution. arXiv."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1","DOI":"10.37256\/aie.5120243220","article-title":"Software Test Case Generation Using Natural Language Processing (NLP): A Systematic Literature Review","volume":"5","author":"Ayenew","year":"2024","journal-title":"Artif. Intell. Evol."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Dawei, X., Liqiu, J., Xinpeng, X., and Yuhang, W. (2016, January 8\u201310). Web application automatic testing solution. Proceedings of the 2016 3rd International Conference on Information Science and Control Engineering (ICISCE), Beijing, China.","DOI":"10.1109\/ICISCE.2016.254"},{"key":"ref_12","first-page":"28","article-title":"Codeless Test Automation for Development QA","volume":"91","author":"Gatla","year":"2023","journal-title":"Am. Sci. Res. J. Eng. Technol. Sci."},{"key":"ref_13","unstructured":"Jiang, J., Wang, F., Shen, J., Kim, S., and Kim, S. (2024). A survey on large language models for code generation. arXiv."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Khan, J.A., Qayyum, S., and Dar, H.S. (2025). Large Language Model for Requirements Engineering: A Systematic Literature Review. Res. Sq.","DOI":"10.21203\/rs.3.rs-5589929\/v1"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1145\/3708522","article-title":"Large language model for vulnerability detection and repair: Literature review and the road ahead","volume":"34","author":"Zhou","year":"2025","journal-title":"ACM Trans. Softw. Eng. Methodol."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Leotta, M., Yousaf, H.Z., Ricca, F., and Garcia, B. (2024, January 18\u201321). AI-generated test scripts for web e2e testing with ChatGPT and copilot: A preliminary study. Proceedings of the 28th International Conference on Evaluation and Assessment in Software Engineering, Salerno, Italy.","DOI":"10.1145\/3661167.3661192"},{"key":"ref_17","first-page":"85","article-title":"An empirical evaluation of using large language models for automated unit test generation","volume":"50","author":"Nadi","year":"2023","journal-title":"IEEE Trans. Softw. Eng."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"911","DOI":"10.1109\/TSE.2024.3368208","article-title":"Software testing with large language models: Survey, landscape, and vision","volume":"50","author":"Wang","year":"2024","journal-title":"IEEE Trans. Softw. Eng."},{"key":"ref_19","unstructured":"Deng, G., Liu, Y., Mayoral-Vilches, V., Liu, P., Li, Y., Xu, Y., Zhang, T., Liu, Y., Pinzger, M., and Rass, S. (2024, January 14\u201316). {PentestGPT}: Evaluating and harnessing large language models for automated penetration testing. Proceedings of the 33rd USENIX Security Symposium (USENIX Security 24), Philadelphia, PA, USA."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Liu, Z., Chen, C., Wang, J., Chen, M., Wu, B., Che, X., Wang, D., and Wang, Q. (2024, January 14\u201320). Make llm a testing expert: Bringing human-like interaction to mobile gui testing via functionality-aware decisions. Proceedings of the IEEE\/ACM 46th International Conference on Software Engineering, Lisbon, Portugal.","DOI":"10.1145\/3597503.3639180"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Job, M.A. (2021). Automating and optimizing software testing using artificial intelligence techniques. Int. J. Adv. Comput. Sci. Appl., 12.","DOI":"10.14569\/IJACSA.2021.0120571"},{"key":"ref_22","unstructured":"Wang, F., Kodur, K., Micheletti, M., Cheng, S.W., Sadasivam, Y., Hu, Y., and Li, Z. (2025, August 28). Large Language Model Driven Automated Software Application Testing. Technical Disclosure Commons. Available online: https:\/\/www.tdcommons.org\/dpubs_series\/6815."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Sherifi, B., Slhoub, K., and Nembhard, F. (2024). The Potential of LLMs in Automating Software Testing: From Generation to Reporting. arXiv.","DOI":"10.1007\/978-3-032-08649-5_13"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"106969","DOI":"10.1016\/j.infsof.2022.106969","article-title":"A deep learning-based automated framework for functional User Interface testing","volume":"150","author":"Khaliq","year":"2022","journal-title":"Inf. Softw. Technol."},{"key":"ref_25","unstructured":"Ale, N.K., and Yarram, R. (2024, January 17\u201318). Enhancing Test Automation with Deep Learning: Techniques, Challenges and Future Prospects. Proceedings of the CS & IT Conference Proceedings, 8th International Conference on Computer Science and Information Technology (COMIT 2024), Chennai, India."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Pei, K., Cao, Y., Yang, J., and Jana, S. (2017, January 28). Deepxplore: Automated whitebox testing of deep learning systems. Proceedings of the 26th Symposium on Operating Systems Principles, Shanghai, China.","DOI":"10.1145\/3132747.3132785"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Zimmermann, D., and Koziolek, A. (2023, January 11\u201315). Gui-based software testing: An automated approach using gpt-4 and selenium webdriver. Proceedings of the 2023 38th IEEE\/ACM International Conference on Automated Software Engineering Workshops (ASEW), Luxembourg.","DOI":"10.1109\/ASEW60602.2023.00028"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Cavalcanti, A.R., Accioly, L., Valen\u00e7a, G., Nogueira, S.C., Morais, A.C., Oliveira, A., and Gomes, S. (2025, January 9\u201312). Automating Test Design Using LLM: Results from an Empirical Study on the Public Sector. Proceedings of the Conference on Digital Government Research, Porto Alegre, Brazil.","DOI":"10.59490\/dgo.2025.1025"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"e1893","DOI":"10.1002\/stvr.1893","article-title":"Improving web element localization by using a large language model","volume":"34","author":"Nass","year":"2024","journal-title":"Softw. Testing Verif. Reliab."},{"key":"ref_30","unstructured":"Le, N.K., Bui, Q.M., Nguyen, M.N., Nguyen, H., Vo, T., Luu, S.T., Nomura, S., and Nguyen, M.L. (2025). Automated Web Application Testing: End-to-End Test Case Generation with Large Language Models and Screen Transition Graphs. arXiv."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Li, T., Cui, C., Huang, R., Towey, D., and Ma, L. (2024). Large Language Models for Automated Web-Form-Test Generation: An Empirical Study. arXiv.","DOI":"10.1145\/3735553"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Wang, S., Wang, S., Fan, Y., Li, X., and Liu, Y. (2024, January 6\u201311). Leveraging large vision-language model for better automatic web GUI testing. Proceedings of the 2024 IEEE International Conference on Software Maintenance and Evolution (ICSME), Flagstaff, AZ, USA.","DOI":"10.1109\/ICSME58944.2024.00022"},{"key":"ref_33","unstructured":"Garousi, V., Joy, N., and Kele\u015f, A.B. (2024). AI-powered test automation tools: A systematic review and empirical evaluation. arXiv."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"1302","DOI":"10.21275\/SR231216065308","article-title":"AI-Based test automation for intelligent chatbot systems","volume":"12","author":"Khankhoje","year":"2023","journal-title":"Int. J. Sci. Res. (IJSR)"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Chapman, C., and Stolee, K.T. (2016, January 18\u201320). Exploring regular expression usage and context in Python. Proceedings of the 25th International Symposium on Software Testing and Analysis, Saarbr\u00fccken, Germany.","DOI":"10.1145\/2931037.2931073"},{"key":"ref_36","unstructured":"(2011). Systems and Software Engineering\u2014Systems and Software Quality Requirements and Evaluation (SQuaRE)\u2014System and Software Quality Models (Standard No. ISO\/IEC 25010:2011). Available online: https:\/\/www.iso.org\/standard\/35733.html."},{"key":"ref_37","unstructured":"Buse, R.L., and Weimer, W.R. (2008, January 9\u201314). Learning a metric for software readability. Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of Software Engineering, Atlanta, GA, USA."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Bondi, A.B. (2000, January 17\u201320). Characteristics of scalability and their impact on performance. Proceedings of the 2nd International Workshop on Software and Performance, Ottawa, ON, Canada.","DOI":"10.1145\/350391.350432"},{"key":"ref_39","unstructured":"Android Developers (2025, September 25). UI\/Application Exerciser Monkey. Available online: https:\/\/developer.android.com\/studio\/test\/other-testing-tools\/monkey."},{"key":"ref_40","unstructured":"OpenAI (2025, November 02). GPT-4-Turbo Pricing and Token Usage Documentation. Available online: https:\/\/openai.com\/pricing."},{"key":"ref_41","unstructured":"(2025, November 10). Claude API Pricing. Available online: https:\/\/www.claude.com\/pricing#api."},{"key":"ref_42","unstructured":"(2025, November 10). xAI API Models and Pricing. Available online: https:\/\/docs.x.ai\/docs\/models."},{"key":"ref_43","unstructured":"The OWASP Foundation (2025, November 05). OWASP AppSensor Project. Available online: https:\/\/owasp.org\/www-project-appsensor\/."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"376","DOI":"10.1007\/s42979-022-01271-1","article-title":"Automating the detection of access control vulnerabilities in web applications","volume":"3","author":"Rennhard","year":"2022","journal-title":"SN Comput. Sci."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1016\/j.procs.2023.12.074","article-title":"Web scraping using natural language processing: Exploiting unstructured text for data extraction and analysis","volume":"230","author":"Pichiyan","year":"2023","journal-title":"Procedia Comput. Sci."}],"container-title":["Computers"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-431X\/14\/11\/501\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,11,19]],"date-time":"2025-11-19T09:08:09Z","timestamp":1763543289000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-431X\/14\/11\/501"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,11,18]]},"references-count":45,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2025,11]]}},"alternative-id":["computers14110501"],"URL":"https:\/\/doi.org\/10.3390\/computers14110501","relation":{},"ISSN":["2073-431X"],"issn-type":[{"value":"2073-431X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,11,18]]}}}