{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,20]],"date-time":"2026-05-20T22:50:39Z","timestamp":1779317439402,"version":"3.51.4"},"reference-count":106,"publisher":"MDPI AG","issue":"12","license":[{"start":{"date-parts":[[2025,12,4]],"date-time":"2025-12-04T00:00:00Z","timestamp":1764806400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>Large Language Models (LLMs) are emerging technologies and a growing research trend in Artificial General Intelligence (AGI), which envisions a future where machines can think and learn like humans across a wide range of tasks. Information generated by LLMs is essentially the prediction of next tokens in Natural Language Processing (NLP) tasks. However, the generated content is always subject to issues of truthfulness and hallucinations. The information and knowledge integrity of LLM-generated content therefore remains subjective. Exploring recent literature on the integrity of LLMs in a systematic manner is both timely and essential. Moreover, ensuring the reliability of LLMs in real-world applications is critical. Various approaches have been explored to promote information and knowledge integrity in LLMs, including adversarial training, data augmentation, and calibration methods. However, beyond these techniques, other strategies also contribute to maintaining knowledge integrity. This paper specifically focuses on three such approaches: knowledge distillation, semantic integrity, and provenance tracking, which play essential roles in ensuring that LLMs generate accurate, consistent, and trustworthy information. Knowledge distillation enhances model efficiency by transferring knowledge from larger models to smaller ones while preserving essential learning without compromising knowledge integrity. This reduces hallucinations. Semantic integrity safeguards consistency and strengthens the robustness of generated outputs. It is concurrently checking the meaningfulness of the outputs with the context. Provenance tracking improves transparency and trustworthiness through mechanisms such as data lineage and explainability, thereby ensuring the credibility of the LLM-generated responses. This review suggests that knowledge distillation, semantic integrity, and provenance tracking can enhance the reliability of LLM outputs, with prior studies reporting reductions in hallucination rates, improvements in robustness, and gains in factual consistency.<\/jats:p>","DOI":"10.3390\/info16121076","type":"journal-article","created":{"date-parts":[[2025,12,4]],"date-time":"2025-12-04T15:14:06Z","timestamp":1764861246000},"page":"1076","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["Knowledge Integrity in Large Language Models: A State-of-The-Art Review"],"prefix":"10.3390","volume":"16","author":[{"ORCID":"https:\/\/orcid.org\/0009-0003-5069-6974","authenticated-orcid":false,"given":"Vadivel","family":"Abishethvarman","sequence":"first","affiliation":[{"name":"Department of Computing and Information Systems, Faculty of Computing, Sabaragamuwa University of Sri Lanka, Belihuloya 70140, Sri Lanka"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8455-2499","authenticated-orcid":false,"given":"Fariza","family":"Sabrina","sequence":"additional","affiliation":[{"name":"School of Engineering and Technology, Central Queensland University, Rockhampton, QLD 4701, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4959-5274","authenticated-orcid":false,"given":"Paul","family":"Kwan","sequence":"additional","affiliation":[{"name":"School of Engineering and Technology, Central Queensland University, Rockhampton, QLD 4701, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,12,4]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Kulkarni, P., Mahabaleshwarkar, A., Kulkarni, M., Sirsikar, N., and Gadgil, K. (2019, January 19\u201321). Conversational AI: An overview of methodologies, applications & future scope. Proceedings of the IEEE 2019 5th International Conference on Computing, Communication, Control and Automation (ICCUBEA), Pune, India.","DOI":"10.1109\/ICCUBEA47591.2019.9129347"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Gao, J., Galley, M., and Li, L. (2018, January 8\u201312). Neural approaches to conversational AI. Proceedings of the 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, Ann Arbor, MI, USA.","DOI":"10.1145\/3209978.3210183"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Kumar, P., Manikandan, S., and Kishore, R. (2024, January 2\u20134). Ai-driven Text Generation: A Novel Gpt-based Approach for Automated Content Creation. Proceedings of the IEEE 2024 2nd International Conference on Networking and Communications (ICNWC), Chennai, India.","DOI":"10.1109\/ICNWC60771.2024.10537562"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"100211","DOI":"10.1016\/j.hcc.2024.100211","article-title":"A survey on large language model (llm) security and privacy: The good, the bad, and the ugly","volume":"4","author":"Yao","year":"2024","journal-title":"High-Confid. Comput."},{"key":"ref_5","unstructured":"Xu, X., Kong, K., Liu, N., Cui, L., Wang, D., Zhang, J., and Kankanhalli, M. (2023). An llm can fool itself: A prompt-based adversarial attack. arXiv."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Wu, Y., Li, Z., Zhang, J.M., and Liu, Y. (2023). Condefects: A new dataset to address the data leakage concern for llm-based fault localization and program repair. arXiv.","DOI":"10.1145\/3663529.3663815"},{"key":"ref_7","unstructured":"Chen, M., Wei, L., Cao, H., Zhou, W., and Hu, S. (2023). Can large language models understand content and propagation for misinformation detection: An empirical study. arXiv."},{"key":"ref_8","first-page":"1502","article-title":"Efficient adversarial training in llms with continuous attacks","volume":"37","author":"Xhonneux","year":"2024","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Zhang, X., Zhang, J., Mo, F., Wang, D., Fu, Y., and Liu, K. (2025). LEKA: LLM-Enhanced Knowledge Augmentation. arXiv.","DOI":"10.24963\/ijcai.2025\/781"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Wang, Z., Shi, Z., Zhou, H., Gao, S., Sun, Q., and Li, J. (2025). Towards Objective Fine-tuning: How LLMs\u2019 Prior Knowledge Causes Potential Poor Calibration?. arXiv.","DOI":"10.18653\/v1\/2025.acl-long.722"},{"key":"ref_11","unstructured":"Hu, S., Zou, G., Yang, S., Lin, S., Gan, Y., Zhang, B., and Chen, Y. (March, January 25). Large language model meets graph neural network in knowledge distillation. Proceedings of the AAAI Conference on Artificial Intelligence, Philadelphia, PA, USA."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Rajan, S.S., Soremekun, E., and Chattopadhyay, S. (2024). Knowledge-based consistency testing of large language models. arXiv.","DOI":"10.18653\/v1\/2024.findings-emnlp.596"},{"key":"ref_13","unstructured":"Wang, J., Lu, X., Zhao, Z., Dai, Z., Foo, C.S., Ng, S.K., and Low, B.K.H. (2023). Source Attribution for Large Language Model-Generated Data. arXiv."},{"key":"ref_14","unstructured":"Xu, X., Li, M., Tao, C., Shen, T., Cheng, R., Li, J., Xu, C., Tao, D., and Zhou, T. (2024). A survey on knowledge distillation of large language models. arXiv."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"3048","DOI":"10.1109\/TPAMI.2021.3055564","article-title":"Knowledge distillation and student-teacher learning for visual intelligence: A review and new outlooks","volume":"44","author":"Wang","year":"2021","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_16","unstructured":"You, D., and Chon, D. (2024). Trust & Safety of LLMs and LLMs in Trust & Safety. arXiv."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3641289","article-title":"A survey on evaluation of large language models","volume":"15","author":"Chang","year":"2024","journal-title":"ACM Trans. Intell. Syst. Technol."},{"key":"ref_18","first-page":"1877","article-title":"Language models are few-shot learners","volume":"33","author":"Brown","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_19","unstructured":"Radford, A., Narasimhan, K., Salimans, T., and Sutskever, I. (2025, November 10). Improving Language Understanding by Generative Pre-training. Available online: https:\/\/cdn.openai.com\/research-covers\/language-unsupervised\/language_understanding_paper.pdf."},{"key":"ref_20","first-page":"9","article-title":"Language models are unsupervised multitask learners","volume":"1","author":"Radford","year":"2019","journal-title":"OpenAI Blog"},{"key":"ref_21","unstructured":"Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., and Anadkat, S. (2023). Gpt-4 technical report. arXiv."},{"key":"ref_22","unstructured":"Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., and Bhosale, S. (2023). Llama 2: Open foundation and fine-tuned chat models. arXiv."},{"key":"ref_23","unstructured":"Team, G., Mesnard, T., Hardin, C., Dadashi, R., Bhupatiraju, S., Pathak, S., Sifre, L., Rivi\u00e8re, M., Kale, M.S., and Love, J. (2024). Gemma: Open models based on gemini research and technology. arXiv."},{"key":"ref_24","unstructured":"Team, G., Anil, R., Borgeaud, S., Alayrac, J.B., Yu, J., Soricut, R., Schalkwyk, J., Dai, A.M., Hauth, A., and Millican, K. (2023). Gemini: A family of highly capable multimodal models. arXiv."},{"key":"ref_25","unstructured":"Team, G., Georgiev, P., Lei, V.I., Burnell, R., Bai, L., Gulati, A., Tanzer, G., Vincent, D., Pan, Z., and Wang, S. (2024). Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context. arXiv."},{"key":"ref_26","first-page":"1","article-title":"Palm: Scaling language modeling with pathways","volume":"24","author":"Chowdhery","year":"2023","journal-title":"J. Mach. Learn. Res."},{"key":"ref_27","unstructured":"Devlin, J., Chang, M.W., Lee, K., and Toutanova, K. (2019, January 2\u20137). Bert: Pre-training of deep bidirectional transformers for language understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), (NAACL-HLT 2019), Minneapolis, MN, USA."},{"key":"ref_28","first-page":"1","article-title":"Exploring the limits of transfer learning with a unified text-to-text transformer","volume":"21","author":"Raffel","year":"2020","journal-title":"J. Mach. Learn. Res."},{"key":"ref_29","first-page":"5998","article-title":"Attention is all you need","volume":"30","author":"Vaswani","year":"2017","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_30","unstructured":"Zhong, T., Liu, Z., Pan, Y., Zhang, Y., Zhou, Y., Liang, S., Wu, Z., Lyu, Y., Shu, P., and Yu, X. (2024). Evaluation of openai o1: Opportunities and challenges of agi. arXiv."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Nazi, Z.A., and Peng, W. (2024). Large language models in healthcare and medical domain: A review. Informatics, 11.","DOI":"10.3390\/informatics11030057"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Shool, S., Adimi, S., Saboori Amleshi, R., Bitaraf, E., Golpira, R., and Tara, M. (2025). A systematic review of large language model (LLM) evaluations in clinical medicine. BMC Med. Inform. Decis. Mak., 25.","DOI":"10.1186\/s12911-025-02954-4"},{"key":"ref_33","unstructured":"Chen, Z.Z., Ma, J., Zhang, X., Hao, N., Yan, A., Nourbakhsh, A., Yang, X., McAuley, J., Petzold, L., and Wang, W.Y. (2024). A survey on large language models for critical societal domains: Finance, healthcare, and law. arXiv."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Chu, Z., Wang, S., Xie, J., Zhu, T., Yan, Y., Ye, J., Zhong, A., Hu, X., Liang, J., and Yu, P.S. (2025). Llm agents for education: Advances and applications. arXiv.","DOI":"10.18653\/v1\/2025.findings-emnlp.743"},{"key":"ref_35","unstructured":"Gong, C., Li, Z., and Li, X. (2025). Information security based on llm approaches: A review. arXiv."},{"key":"ref_36","unstructured":"Kumar, S.S., Cummings, M., and Stimpson, A. (2024, January 15\u201317). Strengthening LLM trust boundaries: A survey of prompt injection attacks. Proceedings of the 2024 IEEE 4th International Conference on Human-Machine Systems (ICHMS), Toronto, ON, Canada."},{"key":"ref_37","unstructured":"Zhao, P., Zhu, W., Jiao, P., Gao, D., and Wu, O. (2025). Data poisoning in deep learning: A survey. arXiv."},{"key":"ref_38","unstructured":"Fang, H., Qiu, Y., Yu, H., Yu, W., Kong, J., Chong, B., Chen, B., Wang, X., Xia, S.T., and Xu, K. (2024). Privacy leakage on dnns: A survey of model inversion attacks and defenses. arXiv."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Leybzon, D., and Kervadec, C. (2024, January 15). Learning, forgetting, remembering: Insights from tracking llm memorization during training. Proceedings of the 7th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP, Miami, FL, USA.","DOI":"10.18653\/v1\/2024.blackboxnlp-1.4"},{"key":"ref_40","unstructured":"Pan, J.Z., Razniewski, S., Kalo, J.C., Singhania, S., Chen, J., Dietze, S., Jabeen, H., Omeliyanenko, J., Zhang, W., and Lissandrini, M. (2023). Large language models and knowledge graphs: Opportunities and challenges. arXiv."},{"key":"ref_41","unstructured":"Du, H., Li, W., Cai, M., Saraipour, K., Zhang, Z., Lakkaraju, H., Sun, Y., and Zhang, S. (2025). How Post-Training Reshapes LLMs: A Mechanistic View on Knowledge, Truthfulness, Refusal, and Confidence. arXiv."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"315","DOI":"10.1007\/s41019-025-00285-y","article-title":"Large language model enhanced knowledge representation learning: A survey","volume":"10","author":"Wang","year":"2025","journal-title":"Data Sci. Eng."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"1085","DOI":"10.1007\/s43681-023-00289-2","article-title":"Auditing large language models: A three-layered approach","volume":"4","author":"Schuett","year":"2024","journal-title":"AI Ethics"},{"key":"ref_44","unstructured":"Veldanda, A.K., Zhang, S.X., Das, A., Chakraborty, S., Rawls, S., Sahu, S., and Naphade, M. (2024). Llm surgery: Efficient knowledge unlearning and editing in large language models. arXiv."},{"key":"ref_45","first-page":"1","article-title":"Knowledge editing for large language models: A survey","volume":"57","author":"Wang","year":"2024","journal-title":"ACM Comput. Surv."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Xu, R., Qi, Z., Guo, Z., Wang, C., Wang, H., Zhang, Y., and Xu, W. (2024). Knowledge conflicts for llms: A survey. arXiv.","DOI":"10.18653\/v1\/2024.emnlp-main.486"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"1789","DOI":"10.1007\/s11263-021-01453-z","article-title":"Knowledge distillation: A survey","volume":"129","author":"Gou","year":"2021","journal-title":"Int. J. Comput. Vis."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3711121","article-title":"Knowledge distillation on graphs: A survey","volume":"57","author":"Tian","year":"2023","journal-title":"ACM Comput. Surv."},{"key":"ref_49","unstructured":"Zuo, F., Rhee, J., and Choe, Y.R. (2025). Knowledge Transfer from LLMs to Provenance Analysis: A Semantic-Augmented Method for APT Detection. arXiv."},{"key":"ref_50","first-page":"1","article-title":"Survey on knowledge distillation for large language models: Methods, evaluation, and application","volume":"15","author":"Yang","year":"2024","journal-title":"ACM Trans. Intell. Syst. Technol."},{"key":"ref_51","unstructured":"Ghosh, B., Hasan, S., Arafat, N.A., and Khan, A. (2024). Logical Consistency of Large Language Models in Fact-checking. arXiv."},{"key":"ref_52","unstructured":"Wang, C., Liu, X., Yue, Y., Tang, X., Zhang, T., Jiayang, C., Yao, Y., Gao, W., Hu, X., and Qi, Z. (2023). Survey on factuality in large language models: Knowledge, retrieval and domain-specificity. arXiv."},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Wang, M., Stoll, A., Lange, L., Adel, H., Sch\u00fctze, H., and Str\u00f6tgen, J. (2025). Bring Your Own Knowledge: A Survey of Methods for LLM Knowledge Expansion. arXiv.","DOI":"10.18653\/v1\/2025.l2m2-1.12"},{"key":"ref_54","unstructured":"Li, M., Zhao, Y., Deng, Y., Zhang, W., Li, S., Xie, W., Ng, S.K., and Chua, T.S. (2024). Knowledge Boundary of Large Language Models: A Survey. arXiv."},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"175","DOI":"10.1007\/s10462-024-10824-0","article-title":"A survey of safety and trustworthiness of large language models through the lens of verification and validation","volume":"57","author":"Huang","year":"2024","journal-title":"Artif. Intell. Rev."},{"key":"ref_56","unstructured":"Kitchenham, B. (2004). Procedures for Performing Systematic Reviews, Keele University."},{"key":"ref_57","unstructured":"Petticrew, M., and Roberts, H. (2008). Systematic Reviews in the Social Sciences: A Practical Guide, John Wiley & Sons."},{"key":"ref_58","doi-asserted-by":"crossref","first-page":"1596","DOI":"10.1111\/cobi.12541","article-title":"Making literature reviews more reliable through application of lessons from systematic reviews","volume":"29","author":"Haddaway","year":"2015","journal-title":"Conserv. Biol."},{"key":"ref_59","doi-asserted-by":"crossref","first-page":"445","DOI":"10.1080\/19439342.2012.711342","article-title":"The benefits and challenges of using systematic reviews in international development research","volume":"4","author":"Mallett","year":"2012","journal-title":"J. Dev. Eff."},{"key":"ref_60","doi-asserted-by":"crossref","first-page":"n71","DOI":"10.1136\/bmj.n71","article-title":"The PRISMA 2020 statement: An updated guideline for reporting systematic reviews","volume":"372","author":"Page","year":"2021","journal-title":"BMJ"},{"key":"ref_61","unstructured":"Shirgaonkar, A., Pandey, N., Abay, N.C., Aktas, T., and Aski, V. (2024). Knowledge Distillation Using Frontier Open-source LLMs: Generalizability and the Role of Synthetic Data. arXiv."},{"key":"ref_62","unstructured":"Chiruzzo, L., Ritter, A., and Wang, L. (2025). Learning with Less: Knowledge Distillation from Large Language Models via Unlabeled Data. Findings of the Association for Computational Linguistics, Proceedings of the 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL 2025), Albuquerque, NM, USA, 29 April\u20134 May 2025, Association for Computational Linguistics."},{"key":"ref_63","unstructured":"Hu, C., Li, X., Liu, D., Wu, H., Chen, X., Wang, J., and Liu, X. (2023). Teacher-student architecture for knowledge distillation: A survey. arXiv."},{"key":"ref_64","first-page":"98297","article-title":"Ddk: Distilling domain knowledge for efficient large language models","volume":"37","author":"Liu","year":"2024","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_65","unstructured":"Nguyen, H., He, Z., Gandre, S.A., Pasupulety, U., Shivakumar, S.K., and Lerman, K. (2025). Smoothing Out Hallucinations: Mitigating LLM Hallucination with Smoothed Knowledge Distillation. arXiv."},{"key":"ref_66","unstructured":"Gu, Y., Dong, L., Wei, F., and Huang, M. (2023). MiniLLM: Knowledge distillation of large language models. arXiv."},{"key":"ref_67","unstructured":"Anshumann, A., Zaidi, M.A., Kedia, A., Ahn, J., Kwon, T., Lee, K., Lee, H., and Lee, J. (August, January 27). Sparse Logit Sampling: Accelerating Knowledge Distillation in LLMs. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Vienna, Austria."},{"key":"ref_68","doi-asserted-by":"crossref","unstructured":"Chen, D., Zhang, S., Gao, F., Zhuang, Y., Tang, S., Liu, Q., and Xu, M. (2024). Logic Distillation: Learning from Code Function by Function for Planning and Decision-making. arXiv.","DOI":"10.24963\/ijcai.2024\/816"},{"key":"ref_69","doi-asserted-by":"crossref","unstructured":"Yang, Y., Tian, B., Yu, F., and He, Y. (2024, January 9\u201312). An Anomaly Detection Model Training Method Based on LLM Knowledge Distillation. Proceedings of the IEEE 2024 International Conference on Networking and Network Applications (NaNA), Yinchuan City, China.","DOI":"10.1109\/NaNA63151.2024.00084"},{"key":"ref_70","doi-asserted-by":"crossref","unstructured":"Di Palo, F., Singhi, P., and Fadlallah, B. (2024). Performance-Guided LLM Knowledge Distillation for Efficient Text Classification at Scale. arXiv.","DOI":"10.18653\/v1\/2024.emnlp-main.215"},{"key":"ref_71","doi-asserted-by":"crossref","unstructured":"Lee, T., Bang, J., Kwon, S., and Kim, T. (2025). Multi-aspect Knowledge Distillation with Large Language Model. arXiv.","DOI":"10.1109\/CVPRW67362.2025.00199"},{"key":"ref_72","unstructured":"Wu, T., Tao, C., Wang, J., Yang, R., Zhao, Z., and Wong, N. (2024). Rethinking kullback-leibler divergence in knowledge distillation for large language models. arXiv."},{"key":"ref_73","unstructured":"Song, Y., Zhang, J., Tian, Z., Yang, Y., Huang, M., and Li, D. (2024). LLM-based privacy data augmentation guided by knowledge distillation with a distribution tutor for medical text classification. arXiv."},{"key":"ref_74","unstructured":"Li, L., Gou, J., Yu, B., Du, L., and Tao, Z.Y.D. (2024). Federated distillation: A survey. arXiv."},{"key":"ref_75","doi-asserted-by":"crossref","unstructured":"Qin, L., Zhu, T., Zhou, W., and Yu, P.S. (2024). Knowledge distillation in federated learning: A survey on long lasting challenges and new solutions. arXiv.","DOI":"10.1155\/int\/7406934"},{"key":"ref_76","doi-asserted-by":"crossref","unstructured":"Huangpu, Q., and Gao, H. (2025, November 10). Efficient Model Compression and Knowledge Distillation on Llama 2: Achieving High Performance with Reduced Computational Cost. Available online: https:\/\/osf.io\/preprints\/osf\/hax36.","DOI":"10.31219\/osf.io\/hax36"},{"key":"ref_77","doi-asserted-by":"crossref","unstructured":"Du, D., Zhang, Y., Cao, S., Guo, J., Cao, T., Chu, X., and Xu, N. (2024). Bitdistiller: Unleashing the potential of sub-4-bit llms via self-distillation. arXiv.","DOI":"10.18653\/v1\/2024.acl-long.7"},{"key":"ref_78","unstructured":"Fan, A., Stock, P., Graham, B., Grave, E., Gribonval, R., Jegou, H., and Joulin, A. (2020). Training with quantization noise for extreme model compression. arXiv."},{"key":"ref_79","doi-asserted-by":"crossref","unstructured":"Liu, Z., Oguz, B., Zhao, C., Chang, E., Stock, P., Mehdad, Y., Shi, Y., Krishnamoorthi, R., and Chandra, V. (2023). Llm-qat: Data-free quantization aware training for large language models. arXiv.","DOI":"10.18653\/v1\/2024.findings-acl.26"},{"key":"ref_80","doi-asserted-by":"crossref","unstructured":"Latif, E., Fang, L., Ma, P., and Zhai, X. (2023). Knowledge distillation of LLM for automatic scoring of science education assessments. arXiv.","DOI":"10.1007\/978-3-031-64312-5_20"},{"key":"ref_81","doi-asserted-by":"crossref","unstructured":"Zheng, D., Li, J., Yang, Y., Wang, Y., and Pang, P.C.I. (2024). MicroBERT: Distilling MoE-Based Knowledge from BERT into a Lighter Model. Appl. Sci., 14.","DOI":"10.3390\/app14146171"},{"key":"ref_82","unstructured":"Sreenivas, S.T., Muralidharan, S., Joshi, R., Chochowski, M., Mahabaleshwarkar, A.S., Shen, G., Zeng, J., Chen, Z., Suhara, Y., and Diao, S. (2024). Llm pruning and distillation in practice: The minitron approach. arXiv."},{"key":"ref_83","first-page":"41076","article-title":"Compact language models via pruning and knowledge distillation","volume":"37","author":"Muralidharan","year":"2024","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_84","unstructured":"Mansourian, A.M., Ahmadi, R., Ghafouri, M., Babaei, A.M., Golezani, E.B., Ghamchi, Z.Y., Ramezanian, V., Taherian, A., Dinashi, K., and Miri, A. (2025). A Comprehensive Survey on Knowledge Distillation. arXiv."},{"key":"ref_85","doi-asserted-by":"crossref","unstructured":"Zhao, Z., Xie, Z., Zhou, G., and Huang, J.X. (2024, January 14\u201318). MTMS: Multi-teacher Multi-stage Knowledge Distillation for Reasoning-Based Machine Reading Comprehension. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, Washington, DC, USA.","DOI":"10.1145\/3626772.3657824"},{"key":"ref_86","doi-asserted-by":"crossref","unstructured":"Tian, Y., Han, Y., Chen, X., Wang, W., and Chawla, N.V. (2024). Beyond Answers: Transferring Reasoning Capabilities to Smaller LLMs Using Multi-Teacher Knowledge Distillation. arXiv.","DOI":"10.1145\/3701551.3703577"},{"key":"ref_87","doi-asserted-by":"crossref","first-page":"10555","DOI":"10.1109\/TPAMI.2023.3257546","article-title":"When object detection meets knowledge distillation: A survey","volume":"45","author":"Li","year":"2023","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_88","doi-asserted-by":"crossref","unstructured":"Yang, M., Chen, Y., Liu, Y., and Shi, L. (2024, January 16\u201320). DistillSeq: A Framework for Safety Alignment Testing in Large Language Models Using Knowledge Distillation. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, Vienna, Austria.","DOI":"10.1145\/3650212.3680304"},{"key":"ref_89","doi-asserted-by":"crossref","first-page":"232","DOI":"10.1109\/TMLCN.2025.3530875","article-title":"Semantic Importance-Aware Communications with Semantic Correction Using Large Language Models","volume":"3","author":"Guo","year":"2025","journal-title":"IEEE Trans. Mach. Learn. Commun. Netw."},{"key":"ref_90","doi-asserted-by":"crossref","unstructured":"Lee, A.W., Chan, J., Fu, M., Kim, N., Mehta, A., Raghavan, D., and Cetintemel, U. (2025). Semantic Integrity Constraints: Declarative Guardrails for AI-Augmented Data Processing Systems. arXiv.","DOI":"10.14778\/3749646.3749677"},{"key":"ref_91","unstructured":"Raj, H., Gupta, V., Rosati, D., and Majumdar, S. (2023). Semantic consistency for assuring reliability of large language models. arXiv."},{"key":"ref_92","doi-asserted-by":"crossref","unstructured":"Galitsky, B., Chernyavskiy, A., and Ilvovsky, D. (2024, January 14\u201318). Truth-o-meter: Handling multiple inconsistent sources repairing LLM hallucinations. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, Washington, DC, USA.","DOI":"10.1145\/3626772.3657679"},{"key":"ref_93","doi-asserted-by":"crossref","unstructured":"Roe, A., Richardson, S., Schneider, J., Cummings, A., Forsberg, N., and Klein, J. (2024). Semantic drift mitigation in large language model knowledge retention using the residual knowledge stability concept. TechRxiv Preprint.","DOI":"10.36227\/techrxiv.173091142.28945162\/v1"},{"key":"ref_94","unstructured":"Yao, J., Sun, H., and Xue, N. (2025). Fact-checking AI-generated news reports: Can LLMs catch their own lies?. arXiv."},{"key":"ref_95","unstructured":"Chanenson, J., Pickering, M., and Apthorpe, N. (2023). Automating governing knowledge commons and contextual integrity (GKC-CI) privacy policy annotations with large language models. arXiv."},{"key":"ref_96","doi-asserted-by":"crossref","unstructured":"Zhu, K., Wang, J., Zhou, J., Wang, Z., Chen, H., Wang, Y., Yang, L., Ye, W., Zhang, Y., and Gong, N. (2023, January 14\u201318). Promptrobust: Towards evaluating the robustness of large language models on adversarial prompts. Proceedings of the 1st ACM Workshop on Large AI Systems and Models with Privacy and Safety Analysis, Salt Lake City, UT, USA.","DOI":"10.1145\/3689217.3690621"},{"key":"ref_97","unstructured":"Liu, S., Chen, J., Ruan, S., Su, H., and Yin, Z. (November, January 28). Exploring the robustness of decision-level through adversarial attacks on llm-based embodied models. Proceedings of the 32nd ACM International Conference on Multimedia, Melbourne, VIC, Australia."},{"key":"ref_98","doi-asserted-by":"crossref","unstructured":"Zou, J., Zhang, S., and Qiu, M. (2024, January 16\u201318). Adversarial attacks on large language models. Proceedings of the International Conference on Knowledge Science, Engineering and Management, Birmingham, UK.","DOI":"10.1007\/978-981-97-5501-1_7"},{"key":"ref_99","doi-asserted-by":"crossref","unstructured":"Wang, C., Zhang, W., Su, Z., Xu, X., and Zhang, X. (2024, January 12\u201316). Sanitizing Large Language Models in Bug Detection with Data-Flow. Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, Miami, FL, USA.","DOI":"10.18653\/v1\/2024.findings-emnlp.217"},{"key":"ref_100","unstructured":"Singh, A., Singh, N., and Vatsal, S. (2024). Robustness of llms to perturbations in text. arXiv."},{"key":"ref_101","first-page":"1","article-title":"Bias testing and mitigation in llm-based code generation","volume":"34","author":"Huang","year":"2025","journal-title":"ACM Trans. Softw. Eng. Methodol."},{"key":"ref_102","unstructured":"Peng, B., Chen, K., Li, M., Feng, P., Bi, Z., Liu, J., and Niu, Q. (2024). Securing large language models: Addressing bias, misinformation, and prompt attacks. arXiv."},{"key":"ref_103","doi-asserted-by":"crossref","unstructured":"Ecker, J.E. (2025, January 6\u201310). Explainable AI for Large Language Models via Context-Aware Word Embeddings. Proceedings of the 2025 AIAA Science and Technology Forum and Exposition (AIAA SCITECH Forum), Orlando, FL, USA.","DOI":"10.2514\/6.2025-1916"},{"key":"ref_104","unstructured":"Mumuni, F., and Mumuni, A. (2025). Explainable artificial intelligence (XAI): From inherent explainability to large language models. arXiv."},{"key":"ref_105","unstructured":"Marks, S., Treutlein, J., Bricken, T., Lindsey, J., Marcus, J., Mishra-Sharma, S., Ziegler, D., Ameisen, E., Batson, J., and Belonax, T. (2025). Auditing language models for hidden objectives. arXiv."},{"key":"ref_106","unstructured":"Singh, S., and Vorster, L. (2024, January 5\u20136). LLM Supply Chain Provenance: A Blockchain-Based Approach. Proceedings of the International Conference on AI Research, Lisbon, Portugal."}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/16\/12\/1076\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,12,4]],"date-time":"2025-12-04T15:28:25Z","timestamp":1764862105000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/16\/12\/1076"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,12,4]]},"references-count":106,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["info16121076"],"URL":"https:\/\/doi.org\/10.3390\/info16121076","relation":{},"ISSN":["2078-2489"],"issn-type":[{"value":"2078-2489","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,12,4]]}}}