{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,5]],"date-time":"2026-04-05T20:38:24Z","timestamp":1775421504842,"version":"3.50.1"},"reference-count":64,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2024,2,6]],"date-time":"2024-02-06T00:00:00Z","timestamp":1707177600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Maroun Semaan Faculty of Engineering and Architecture (MSFEA) at the American University of Beirut (AUB)"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["MAKE"],"abstract":"<jats:p>Large Language Models (LLMs) are reshaping the landscape of Machine Learning (ML) application development. The emergence of versatile LLMs capable of undertaking a wide array of tasks has reduced the necessity for intensive human involvement in training and maintaining ML models. Despite these advancements, a pivotal question emerges: can these generalized models negate the need for task-specific models? This study addresses this question by comparing the effectiveness of LLMs in detecting phishing URLs when utilized with prompt-engineering techniques versus when fine-tuned. Notably, we explore multiple prompt-engineering strategies for phishing URL detection and apply them to two chat models, GPT-3.5-turbo and Claude 2. In this context, the maximum result achieved was an F1-score of 92.74% by using a test set of 1000 samples. Following this, we fine-tune a range of base LLMs, including GPT-2, Bloom, Baby LLaMA, and DistilGPT-2\u2014all primarily developed for text generation\u2014exclusively for phishing URL detection. The fine-tuning approach culminated in a peak performance, achieving an F1-score of 97.29% and an AUC of 99.56% on the same test set, thereby outperforming existing state-of-the-art methods. 
These results highlight that while LLMs harnessed through prompt engineering can expedite application development processes, achieving a decent performance, they are not as effective as dedicated, task-specific LLMs.<\/jats:p>","DOI":"10.3390\/make6010018","type":"journal-article","created":{"date-parts":[[2024,2,7]],"date-time":"2024-02-07T08:28:16Z","timestamp":1707294496000},"page":"367-384","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":91,"title":["Prompt Engineering or Fine-Tuning? A Case Study on Phishing Detection with Large Language Models"],"prefix":"10.3390","volume":"6","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2241-8195","authenticated-orcid":false,"given":"Fouad","family":"Trad","sequence":"first","affiliation":[{"name":"Electrical and Computer Engineering, American University of Beirut, Beirut 1107-2020, Lebanon"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1939-2740","authenticated-orcid":false,"given":"Ali","family":"Chehab","sequence":"additional","affiliation":[{"name":"Electrical and Computer Engineering, American University of Beirut, Beirut 1107-2020, Lebanon"}]}],"member":"1968","published-online":{"date-parts":[[2024,2,6]]},"reference":[{"key":"ref_1","first-page":"347","article-title":"Social Network Mining from Natural Language Text and Event Logs for Compliance Deviation Detection","volume":"Volume 14353","author":"Mustroph","year":"2024","journal-title":"Cooperative Information Systems. CoopIS 2023"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"464","DOI":"10.1007\/978-3-031-45673-2_46","article-title":"Tailoring Large Language Models to Radiology: A Preliminary Approach to LLM Adaptation for a Highly Specialized Domain","volume":"Volume 14348","author":"Liu","year":"2024","journal-title":"Machine Learning in Medical Imaging. 
MLMI 2023"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"103580","DOI":"10.1016\/j.jretconser.2023.103580","article-title":"GPT and CLT: The impact of ChatGPT\u2019s level of abstraction on consumer recommendations","volume":"76","author":"Kirshner","year":"2024","journal-title":"J. Retail. Consum. Serv."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"121186","DOI":"10.1016\/j.eswa.2023.121186","article-title":"Can ChatGPT provide intelligent diagnoses? A comparative study between predictive models and ChatGPT to define a new medical diagnostic bot","volume":"235","author":"Caruccio","year":"2024","journal-title":"Expert Syst. Appl."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Shi, Y., Ren, P., Wang, J., Han, B., ValizadehAslani, T., Agbavor, F., Zhang, Y., Hu, M., Zhao, L., and Liang, H. (2023). Leveraging GPT-4 for food effect summarization to enhance product-specific guidance development via iterative prompting. J. Biomed. Inform., 148.","DOI":"10.1016\/j.jbi.2023.104533"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1186\/s41239-023-00425-2","article-title":"AI-generated feedback on writing: Insights into efficacy and ENL student preference","volume":"20","author":"Escalante","year":"2023","journal-title":"Int. J. Educ. Technol. High. Educ."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Dhamija, R., Tygar, J.D., and Hearst, M. (2006, January 22\u201327). Why phishing works. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, New York, NY, USA.","DOI":"10.1145\/1124772.1124861"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"231","DOI":"10.1016\/j.eswa.2016.01.028","article-title":"New rule-based phishing detection method","volume":"53","author":"Moghimi","year":"2016","journal-title":"Expert Syst. 
Appl."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"153","DOI":"10.1049\/iet-ifs.2013.0202","article-title":"Intelligent rule-based phishing websites classification","volume":"8","author":"Mohammad","year":"2014","journal-title":"IET Inf. Secur."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"345","DOI":"10.1016\/j.eswa.2018.09.029","article-title":"Machine learning based phishing detection from URLs","volume":"117","author":"Sahingoz","year":"2019","journal-title":"Expert Syst. Appl."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"672","DOI":"10.3390\/make3030034","article-title":"A Survey of Machine Learning-Based Solutions for Phishing Website Detection","volume":"3","author":"Tang","year":"2021","journal-title":"Mach. Learn. Knowl. Extr."},{"key":"ref_12","unstructured":"Rocha, A., and Pereira, R.P. (2020). Classification of Phishing Attack Solutions by Employing Deep Learning Techniques: A Systematic Literature Review. Developments and Advances in Defense and Security, Springer. Smart Innovation, Systems and Technologies."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1457","DOI":"10.1007\/s10115-022-01672-x","article-title":"Applications of deep learning for phishing detection: A systematic literature review","volume":"64","author":"Catal","year":"2022","journal-title":"Knowl. Inf. Syst."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"36429","DOI":"10.1109\/ACCESS.2022.3151903","article-title":"Deep Learning for Phishing Detection: Taxonomy, Current Challenges and Future Directions","volume":"10","author":"Do","year":"2022","journal-title":"IEEE Access"},{"key":"ref_15","unstructured":"White, J., Fu, Q., Hays, S., Sandborn, M., Olea, C., Gilbert, H., Elnashar, A., Spencer-Smith, J., and Schmidt, D.C. (2023). A Prompt Pattern Catalog to Enhance Prompt Engineering with ChatGPT. arXiv."},{"key":"ref_16","unstructured":"Lv, K., Yang, Y., Liu, T., Gao, Q., Guo, Q., and Qiu, X. (2023). 
Full Parameter Fine-tuning for Large Language Models with Limited Resources. arXiv."},{"key":"ref_17","unstructured":"Hannousse, A., and Yahiouche, S. (2021). Web Page Phishing Detection, Mendeley Data."},{"key":"ref_18","unstructured":"Dolev, S., and Schieber, B. (2023). Machine Learning-Based Phishing Detection Using URL Features: A Comprehensive Review. Stabilization, Safety, and Security of Distributed Systems, Springer. Lecture Notes in Computer Science."},{"key":"ref_19","unstructured":"Zhao, W.X., Zhou, K., Li, J., Tang, T., Wang, X., Hou, Y., Min, Y., Zhang, B., Zhang, J., and Dong, Z. (2023). A Survey of Large Language Models. arXiv."},{"key":"ref_20","unstructured":"Yang, J., Jin, H., Tang, R., Han, X., Feng, Q., Jiang, H., Yin, B., and Hu, X. (2023). Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond. arXiv."},{"key":"ref_21","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., and Polosukhin, I. (2017). Attention is All you Need. Advances in Neural Information Processing Systems, Curran Associates, Inc."},{"key":"ref_22","unstructured":"Mikolov, T., Chen, K., Corrado, G., and Dean, J. (2023). Efficient Estimation of Word Representations in Vector Space. arXiv."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Pennington, J., Socher, R., and Manning, C. (2014, January 25\u201329). Glove: Global Vectors for Word Representation. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar.","DOI":"10.3115\/v1\/D14-1162"},{"key":"ref_24","unstructured":"Radford, A., Narasimhan, K., Salimans, T., and Sutskever, I. (2018). 
Improving Language Understanding by Generative Pre-Training, OpenAI."},{"key":"ref_25","first-page":"9","article-title":"Language models are unsupervised multitask learners","volume":"1","author":"Radford","year":"2019","journal-title":"OpenAI Blog"},{"key":"ref_26","unstructured":"Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., and Askell, A. (2020). Language Models are Few-Shot Learners. arXiv."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"121","DOI":"10.1016\/j.iotcps.2023.04.003","article-title":"ChatGPT: A comprehensive review on background, applications, key challenges, bias, ethics, limitations and future scope","volume":"3","author":"Ray","year":"2023","journal-title":"Internet Things-Cyber-Phys. Syst."},{"key":"ref_28","first-page":"22199","article-title":"Large Language Models are Zero-Shot Reasoners","volume":"35","author":"Kojima","year":"2022","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_29","first-page":"30378","article-title":"The Unreliability of Explanations in Few-shot Prompting for Textual Reasoning","volume":"35","author":"Ye","year":"2022","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Kong, A., Zhao, S., Chen, H., Li, Q., Qin, Y., Sun, R., and Zhou, X. (2023). Better Zero-Shot Reasoning with Role-Play Prompting. arXiv.","DOI":"10.18653\/v1\/2024.naacl-long.228"},{"key":"ref_31","unstructured":"Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E., Le, Q., and Zhou, D. (2023). Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. arXiv."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Hu, Z., Wang, L., Lan, Y., Xu, W., Lim, E.P., Bing, L., Xu, X., Poria, S., and Lee, R.K.W. (2023). LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models. 
arXiv.","DOI":"10.18653\/v1\/2023.emnlp-main.319"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Howard, J., and Ruder, S. (2018). Universal Language Model Fine-tuning for Text Classification. arXiv.","DOI":"10.18653\/v1\/P18-1031"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Wang, Y., Ma, W., Xu, H., Liu, Y., and Yin, P. (2023). A Lightweight Multi-View Learning Approach for Phishing Attack Detection Using Transformer with Mixture of Experts. Appl. Sci., 13.","DOI":"10.3390\/app13137429"},{"key":"ref_35","unstructured":"(2024, January 08). Introducing Cloudflare\u2019s 2023 Phishing Threats Report. Available online: https:\/\/blog.cloudflare.com\/2023-phishing-report."},{"key":"ref_36","unstructured":"Sahoo, D., Liu, C., and Hoi, S.C.H. (2019). Malicious URL Detection using Machine Learning: A Survey. arXiv."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Woodbridge, J., Anderson, H.S., Ahuja, A., and Grant, D. (2018, January 24). Detecting homoglyph attacks with a siamese neural network. Proceedings of the 2018 IEEE Security and Privacy Workshops (SPW), San Francisco, CA, USA.","DOI":"10.1109\/SPW.2018.00012"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Sern, L.J., David, Y.G.P., and Hao, C.J. (2020, January 3\u20135). PhishGAN: Data Augmentation and Identification of Homoglyph Attacks. Proceedings of the 2020 International Conference on Communications, Computing, Cybersecurity, and Informatics (CCCI), Virtual.","DOI":"10.1109\/CCCI49893.2020.9256804"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Hageman, K., Kidmose, E., Hansen, R.R., and Pedersen, J.M. (2021, January 6\u20138). Can a TLS certificate be phishy?. 
Proceedings of the 18th International Conference on Security and Cryptography, SECRYPT 2021, Online.","DOI":"10.5220\/0010516600002998"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"101855","DOI":"10.1016\/j.cose.2020.101855","article-title":"LogoSENSE: A companion HOG based logo detection scheme for phishing web page and E-mail brand recognition","volume":"95","author":"Bozkir","year":"2020","journal-title":"Comput. Secur."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"101613","DOI":"10.1016\/j.cose.2019.101613","article-title":"Heuristic-based strategy for Phishing prediction: A survey of URL-based approach","volume":"88","author":"Feitosa","year":"2020","journal-title":"Comput. Secur."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Chhabra, S., Aggarwal, A., Benevenuto, F., and Kumaraguru, P. (2011, January 1\u20132). Phi.sh\/$oCiaL: The phishing landscape through short URLs. Proceedings of the 8th Annual Collaboration, Electronic Messaging, Anti-Abuse and Spam Conference, New York, NY, USA.","DOI":"10.1145\/2030376.2030387"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"107275","DOI":"10.1016\/j.comnet.2020.107275","article-title":"Accurate and fast URL phishing detector: A convolutional neural network approach","volume":"178","author":"Wei","year":"2020","journal-title":"Comput. Netw."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1186\/s13673-017-0098-1","article-title":"A novel lightweight URL phishing detection system using SVM and similarity index","volume":"7","author":"Zouina","year":"2017","journal-title":"Hum.-Centric Comput. Inf. Sci."},{"key":"ref_45","first-page":"45","article-title":"Phishing Website Detection using Machine Learning Algorithms","volume":"181","author":"Mahajan","year":"2018","journal-title":"Int. J. Comput. 
Appl."},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"103288","DOI":"10.1016\/j.advengsoft.2022.103288","article-title":"Phishing URL detection using machine learning methods","volume":"173","author":"Ahammad","year":"2022","journal-title":"Adv. Eng. Softw."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Huang, Y., Yang, Q., Qin, J., and Wen, W. (2019, January 5\u20138). Phishing URL Detection via CNN and Attention-Based Hierarchical RNN. Proceedings of the 2019 18th IEEE International Conference On Trust, Security And Privacy In Computing And Communications\/13th IEEE International Conference on Big Data Science and Engineering (TrustCom\/BigDataSE), Rotorua, New Zealand.","DOI":"10.1109\/TrustCom\/BigDataSE.2019.00024"},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"e8241104","DOI":"10.1155\/2021\/8241104","article-title":"Hybrid Rule-Based Solution for Phishing URL Detection Using Convolutional Neural Network","volume":"2021","author":"Mourtaji","year":"2021","journal-title":"Wirel. Commun. Mob. Comput."},{"key":"ref_49","unstructured":"Le, H., Pham, Q., Sahoo, D., and Hoi, S.C.H. (2018). URLNet: Learning a URL Representation with Deep Learning for Malicious URL Detection. arXiv."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Tajaddodianfar, F., Stokes, J.W., and Gururajan, A. (2020, January 4\u20138). Texception: A Character\/Word-Level Deep Learning Model for Phishing URL Detection. Proceedings of the ICASSP 2020\u20142020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain.","DOI":"10.1109\/ICASSP40776.2020.9053670"},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Lin, X., Ghorbani, A., Ren, K., Zhu, S., and Zhang, A. (2018). A Deep Learning Based Online Malicious URL and DNS Detection Scheme. Security and Privacy in Communication Networks, Springer. 
Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering.","DOI":"10.1007\/978-3-319-78816-6"},{"key":"ref_52","doi-asserted-by":"crossref","first-page":"4957","DOI":"10.1007\/s00521-021-06401-z","article-title":"A hybrid DNN\u2013LSTM model for detecting phishing URLs","volume":"35","author":"Ozcan","year":"2023","journal-title":"Neural Comput. Appl."},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"119723","DOI":"10.1016\/j.eswa.2023.119723","article-title":"Hybrid phishing detection using joint visual and textual identity","volume":"220","author":"Tan","year":"2023","journal-title":"Expert Syst. Appl."},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"104347","DOI":"10.1016\/j.engappai.2021.104347","article-title":"Towards benchmark datasets for machine learning based website phishing detection: An experimental study","volume":"104","author":"Hannousse","year":"2021","journal-title":"Eng. Appl. Artif. Intell."},{"key":"ref_55","first-page":"2825","article-title":"Scikit-learn: Machine Learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J. Mach. Learn. Res."},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., and Funtowicz, M. (2020). HuggingFace\u2019s Transformers: State-of-the-art Natural Language Processing. arXiv.","DOI":"10.18653\/v1\/2020.emnlp-demos.6"},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Timiryasov, I., and Tastet, J.L. (2023). Baby Llama: Knowledge distillation from an ensemble of teachers trained on a small dataset with no performance penalty. arXiv.","DOI":"10.18653\/v1\/2023.conll-babylm.24"},{"key":"ref_58","unstructured":"Dakle, P.P., Rallabandi, S., and Raghavan, P. (2023). Understanding BLOOM: An empirical study on diverse NLP tasks. 
arXiv."},{"key":"ref_59","doi-asserted-by":"crossref","unstructured":"Nepal, S., Gurung, H., and Nepal, R. (2022). Phishing URL Detection Using CNN-LSTM and Random Forest Classifier. Preprint.","DOI":"10.21203\/rs.3.rs-2043842\/v2"},{"key":"ref_60","doi-asserted-by":"crossref","unstructured":"Rutkowski, L., Scherer, R., Korytkowski, M., Pedrycz, W., Tadeusiewicz, R., and Zurada, J.M. (2023). Phishing Attack Detection: An Improved Performance Through Ensemble Learning. Artificial Intelligence and Soft Computing, Springer. Lecture Notes in Computer Science.","DOI":"10.1007\/978-3-031-42508-0"},{"key":"ref_61","first-page":"451","article-title":"Cloud-Based Machine Learning Approach for Accurate Detection of Website Phishing","volume":"11","author":"Rashid","year":"2023","journal-title":"Int. J. Intell. Syst. Appl. Eng."},{"key":"ref_62","doi-asserted-by":"crossref","unstructured":"Uppalapati, P.J., Gontla, B.K., Gundu, P., Hussain, S.M., and Narasimharo, K. (2023). A Machine Learning Approach to Identifying Phishing Websites: A Comparative Study of Classification Models and Ensemble Learning Techniques. ICST Trans. Scalable Inf. Syst., 10.","DOI":"10.4108\/eetsis.vi.3300"},{"key":"ref_63","doi-asserted-by":"crossref","unstructured":"Wang, Y., Zhu, W., Xu, H., Qin, Z., Ren, K., and Ma, W. (2023, January 4\u201310). A Large-Scale Pretrained Deep Model for Phishing URL Detection. Proceedings of the ICASSP 2023\u20142023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece.","DOI":"10.1109\/ICASSP49357.2023.10095719"},{"key":"ref_64","unstructured":"Arp, D., Quiring, E., Pendlebury, F., Warnecke, A., Pierazzi, F., Wressnegger, C., Cavallaro, L., and Rieck, K. (2022, January 10\u201312). Dos and Don\u2019ts of Machine Learning in Computer Security. 
Proceedings of the 31st USENIX Security Symposium, Boston, MA, USA."}],"container-title":["Machine Learning and Knowledge Extraction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-4990\/6\/1\/18\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T13:55:41Z","timestamp":1760104541000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-4990\/6\/1\/18"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,2,6]]},"references-count":64,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,3]]}},"alternative-id":["make6010018"],"URL":"https:\/\/doi.org\/10.3390\/make6010018","relation":{},"ISSN":["2504-4990"],"issn-type":[{"value":"2504-4990","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,2,6]]}}}