{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,31]],"date-time":"2026-01-31T04:52:35Z","timestamp":1769835155362,"version":"3.49.0"},"reference-count":61,"publisher":"China Science Publishing & Media Ltd.","issue":"2","license":[{"start":{"date-parts":[[2024,4,12]],"date-time":"2024-04-12T00:00:00Z","timestamp":1712880000000},"content-version":"vor","delay-in-days":102,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2024,5,1]]},"abstract":"<jats:title>ABSTRACT<\/jats:title>\n               <jats:p>The expansion of Chinese natural language processing (NLP) has stimulated research in the broader NLP domain. However, existing large language models have limitations in comprehending and reasoning in Chinese. This paper addresses these limitations by enhancing Chinese language models' comprehension and reasoning capabilities while minimizing resource requirements. We propose LLaMA-LoRA, a neural prompt engineering framework that builds upon the LLaMA-13B model and incorporates the Low-Rank Adaptation (LoRA) of Large Language Models technique for refinement. Chain-of-Thought (CoT) prompts are crucial for generating intermediate reasoning chains in language models, but their effectiveness can be limited by isolated language patterns. Erroneous reasoning resulting from conventional prompts negatively impacts model performance. Automatic prompts are introduced to encourage reasoning chain generation and accurate answer inference. Training the model with an extensive corpus of Chinese CoT data enhances its comprehension and reasoning abilities. 
The LLaMA-LoRA model demonstrates exceptional performance across numerous Chinese language tasks, surpassing benchmark performance achieved by related language models such as GPT-3.5, ChatGLM, and OpenAssistant, delivering accurate, comprehensive, and professional answers. The availability of our open-source model code facilitates further research in the field of Chinese text logical reasoning thinking chains.<\/jats:p>","DOI":"10.1162\/dint_a_00251","type":"journal-article","created":{"date-parts":[[2024,4,12]],"date-time":"2024-04-12T19:18:45Z","timestamp":1712949525000},"page":"375-408","update-policy":"https:\/\/doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":5,"title":["LLaMA-LoRA Neural Prompt Engineering: A Deep Tuning Framework for Automatically Generating Chinese Text Logical Reasoning Thinking Chains"],"prefix":"10.3724","volume":"6","author":[{"given":"Songlin","family":"Chen","sequence":"first","affiliation":[{"name":"School of Computer and Software Engineering, Xihua University, Chengdu 610039, P.R. China"}]},{"given":"Weicheng","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Computer and Software Engineering, Xihua University, Chengdu 610039, P.R. China"}]},{"given":"Xiaoliang","family":"Chen","sequence":"additional","affiliation":[{"name":"School of Computer and Software Engineering, Xihua University, Chengdu 610039, P.R. 
China"},{"name":"Department of Computer Science and Operations Research, University of Montreal, Montreal, QC H3C3J7, Canada"}]},{"given":"Peng","family":"Lu","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Operations Research, University of Montreal, Montreal, QC H3C3J7, Canada"}]},{"given":"Zaiyan","family":"Yang","sequence":"additional","affiliation":[{"name":"College of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, 100876, China"}]},{"given":"Yajun","family":"Du","sequence":"additional","affiliation":[{"name":"School of Computer and Software Engineering, Xihua University, Chengdu 610039, P.R. China"}]}],"member":"2026","published-online":{"date-parts":[[2024,5,1]]},"reference":[{"issue":"8","key":"2024071119550965200_ref1","first-page":"9","article-title":"Language models are unsupervised multitask learners","volume":"1","author":"Radford","year":"2019","journal-title":"OpenAI blog"},{"key":"2024071119550965200_ref2","volume-title":"LLaMA: Open and Efficient Foundation Language Models","author":"Touvron","year":"2023"},{"key":"2024071119550965200_ref3","article-title":"Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%*","volume-title":"ChatGPT Quality","author":"Chiang","year":"2023"},{"key":"2024071119550965200_ref4","volume-title":"OPT: Open pre-trained transformer language models","author":"Zhang","year":"2022"},{"key":"2024071119550965200_ref5","volume-title":"BLOOM: A 176b-parameter open-access multilingual language model","author":"Scao","year":"2022"},{"key":"2024071119550965200_ref6","first-page":"1877","article-title":"Language models are few-shot learners","volume":"33","author":"Brown","year":"2020","journal-title":"Advances in Neural Information Processing Systems"},{"issue":"4","key":"2024071119550965200_ref7","doi-asserted-by":"crossref","first-page":"885","DOI":"10.1162\/dint_a_00232","article-title":"Evaluation on ChatGPT for Chinese Language 
Understanding","volume":"5","author":"Li","year":"2023","journal-title":"Data Intelligence"},{"key":"2024071119550965200_ref8","volume-title":"Training compute-optimal large language models","author":"Hoffmann","year":"2022"},{"key":"2024071119550965200_ref9","first-page":"5859","article-title":"Math word problem solving with explicit numerical values","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"Wu","year":"2021"},{"key":"2024071119550965200_ref10","volume-title":"On the advance of making language models better reasoners","author":"Li","year":"2022"},{"key":"2024071119550965200_ref11","first-page":"101525","article-title":"Improving bert with local context comprehension for multi-turn response selection in retrieval-based dialogue systems","volume-title":"Computer Speech&Language","author":"Chen","year":"2023"},{"key":"2024071119550965200_ref12","doi-asserted-by":"crossref","first-page":"101412","DOI":"10.1016\/j.csl.2022.101412","article-title":"Multi-level context features extraction for named entity recognition","volume":"77","author":"Chang","year":"2023","journal-title":"Computer Speech & Language"},{"key":"2024071119550965200_ref13","article-title":"GPT understands, too","volume":"abs\/2103.10385","author":"Liu","year":"2021","journal-title":"CoRR"},{"key":"2024071119550965200_ref14","first-page":"4582","article-title":"Prefix-tuning: Optimizing continuous prompts for generation","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL\/IJCNLP 2021, (Volume 1: Long Papers)","author":"Li","year":"2021"},{"key":"2024071119550965200_ref15","first-page":"12697","article-title":"Calibrate before use: Improving few-shot performance of language 
models","volume-title":"International Conference on Machine Learning","author":"Zhao","year":"2021"},{"key":"2024071119550965200_ref16","volume-title":"Chain of thought prompting elicits reasoning in large language models","author":"Wei","year":"2022"},{"key":"2024071119550965200_ref17","doi-asserted-by":"crossref","first-page":"408","DOI":"10.1016\/j.future.2020.02.023","article-title":"Diversified top-kmaximal clique detection in social internet of things","volume":"107","author":"Hao","year":"2020","journal-title":"Future Generation Computer Systems"},{"key":"2024071119550965200_ref18","volume-title":"Training verifiers to solve math word problems","author":"Cobbe","year":"2021"},{"key":"2024071119550965200_ref19","first-page":"1152","article-title":"MAWPS: A math word problem repository","volume-title":"Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Koncel-Kedziorski","year":"2016"},{"key":"2024071119550965200_ref20","volume-title":"Lora: Low-rank adaptation of large language models","author":"Hu","year":"2021"},{"key":"2024071119550965200_ref21","first-page":"2790","article-title":"Parameter-efficient transfer learning for NLP","volume":"97","author":"Houlsby","year":"2019","journal-title":"Proceedings of the 36th International Conference on Machine Learning, ICML 2019. 
Proceedings of Machine Learning Research"},{"key":"2024071119550965200_ref22","first-page":"506","article-title":"Learning multiple visual domains with residual adapters","volume-title":"Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017","author":"Rebuffi","year":"2017"},{"key":"2024071119550965200_ref23","doi-asserted-by":"crossref","first-page":"7930","DOI":"10.18653\/v1\/2021.emnlp-main.626","article-title":"Adapterdrop: On the efficiency of adapters in transformers","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021","author":"R\u00fcckl\u00e9","year":"2021"},{"key":"2024071119550965200_ref24","first-page":"1022","article-title":"Compacter: Efficient low-rank hypercomplex adapter layers","volume-title":"Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021","author":"Mahabadi","year":"2021"},{"key":"2024071119550965200_ref25","doi-asserted-by":"crossref","first-page":"3045","DOI":"10.18653\/v1\/2021.emnlp-main.243","article-title":"The power of scale for parameter-efficient prompt tuning","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021, Republic, 7-11","author":"Lester","year":"2021"},{"key":"2024071119550965200_ref26","first-page":"4921","article-title":"WARP: word-level adversarial reprogramming","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL\/IJCNLP 2021, (Volume 1: Long Papers)","author":"Hambardzumyan","year":"2021"},{"key":"2024071119550965200_ref27","first-page":"2358","article-title":"Recovery guarantee of weighted low-rank approximation via alternating minimization","volume-title":"Proceedings of the 33nd International Conference on Machine 
Learning, ICML 2016, 48","author":"Li","year":"2016"},{"issue":"4","key":"2024071119550965200_ref28","doi-asserted-by":"crossref","first-page":"1956","DOI":"10.1137\/080738970","article-title":"A singular value thresholding algorithm for matrix completion","volume":"20","author":"Cai","year":"2010","journal-title":"SIAM Journal on optimization"},{"key":"2024071119550965200_ref29","first-page":"2","article-title":"Algorithmic regularization in over-parameterized matrix sensing and neural networks with quadratic activations","volume-title":"Conference On Learning Theory","author":"Li","year":"2018"},{"issue":"1","key":"2024071119550965200_ref30","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1002\/gamm.201310004","article-title":"A literature survey of low-rank tensor approximation techniques","volume":"36","author":"Grasedyck","year":"2013","journal-title":"GAMM-Mitteilungen"},{"key":"2024071119550965200_ref31","volume-title":"Generalization guarantees for neural networks via harnessing the low-rank structure of the jacobian","author":"Oymak","year":"2019"},{"key":"2024071119550965200_ref32","doi-asserted-by":"crossref","first-page":"6655","DOI":"10.1109\/ICASSP.2013.6638949","article-title":"Low-rank matrix factorization for deep neural network training with high-dimensional output targets","volume-title":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","author":"Sainath","year":"2013"},{"key":"2024071119550965200_ref33","first-page":"3743","article-title":"Semi-orthogonal low-rank matrix factorization for deep neural networks","volume-title":"Interspeech","author":"Povey","year":"2018"},{"key":"2024071119550965200_ref34","doi-asserted-by":"crossref","first-page":"185","DOI":"10.1109\/ICASSP.2014.6853583","article-title":"Extracting deep neural network bottleneck features using low-rank matrix factorization","volume-title":"2014 IEEE International Conference on Acoustics, Speech and Signal Processing 
(ICASSP)","author":"Zhang","year":"2014"},{"key":"2024071119550965200_ref35","doi-asserted-by":"crossref","DOI":"10.5244\/C.28.88","volume-title":"Speeding up convolutional neural networks with low rank expansions","author":"Jaderberg","year":"2014"},{"key":"2024071119550965200_ref36","first-page":"242","article-title":"A convergence theory for deep learning via overparameterization","volume-title":"International Conference on Machine Learning","author":"Allen-Zhu","year":"2019"},{"key":"2024071119550965200_ref37","volume":"31","author":"Li","year":"2018","journal-title":"Learning overparameterized neural networks via stochastic gradient descent on structured data. Advances in neural information processing systems"},{"key":"2024071119550965200_ref38","first-page":"14820","article-title":"When do neural networks outperform kernel methods?","volume":"33","author":"Ghorbani","year":"2020","journal-title":"Advances in Neural Information Processing Systems"},{"key":"2024071119550965200_ref39","article-title":"What can resnet learn efficiently, going beyond kernels?","volume":"32","author":"Allen-Zhu","year":"2019","journal-title":"Advances in Neural Information Processing Systems"},{"key":"2024071119550965200_ref40","volume-title":"Backward feature correction: How deep learning performs deep learning","author":"Allen-Zhu","year":"2020"},{"key":"2024071119550965200_ref41","doi-asserted-by":"crossref","first-page":"977","DOI":"10.1109\/FOCS52979.2021.00098","article-title":"Feature purification: How adversarial training performs robust deep learning","volume-title":"2021 IEEE 62nd Annual Symposium on Foundations of Computer Science (FOCS)","author":"Allen-Zhu","year":"2022"},{"key":"2024071119550965200_ref42","volume-title":"Chain of thought prompting elicits reasoning in large language models","author":"Wei","year":"2022"},{"key":"2024071119550965200_ref43","volume-title":"Least-to-most prompting enables complex reasoning in large language 
models","author":"Zhou","year":"2022"},{"key":"2024071119550965200_ref44","volume-title":"Complexity-based prompting for multi-step reasoning","author":"Fu","year":"2022"},{"key":"2024071119550965200_ref45","volume-title":"Rationale-augmented ensembles in language models","author":"Wang","year":"2022"},{"key":"2024071119550965200_ref46","volume-title":"Automatic chain of thought prompting in large language models","author":"Zhang","year":"2022"},{"key":"2024071119550965200_ref47","volume-title":"Stanford Alpaca: An Instruction-following LLaMA model","author":"Taori","year":"2023"},{"issue":"12","key":"2024071119550965200_ref48","doi-asserted-by":"crossref","first-page":"2639","DOI":"10.1162\/0899766042321814","article-title":"Canonical correlation analysis: An overview with application to learning methods","volume":"16","author":"Hardoon","year":"2004","journal-title":"Neural computation"},{"key":"2024071119550965200_ref49","volume-title":"BELLE: Be Everyone's Large Language model Engine","author":"Yunjie Ji","year":"2023"},{"key":"2024071119550965200_ref50","doi-asserted-by":"crossref","DOI":"10.1007\/s11633-024-1502-8","article-title":"MOSS: An Open Conversational Large Language Model","volume-title":"Machine Intelligence Research","author":"Sun","year":"2024"},{"key":"2024071119550965200_ref51","volume-title":"Parrot: Translating during chat using large language models","author":"Jiao","year":"2023"},{"key":"2024071119550965200_ref52","volume-title":"Scaling instruction-finetuned language models","author":"Chung","year":"2022"},{"key":"2024071119550965200_ref53","volume-title":"GLM-130B: An open bilingual pre-trained model","author":"Zeng","year":"2022"},{"key":"2024071119550965200_ref54","volume-title":"Efficient and effective text encoding for Chinese LLaMA and Alpaca","author":"Cui","year":"2023"},{"key":"2024071119550965200_ref55","volume-title":"Stability AI Language 
Models","author":"StableLM","year":"2023"},{"key":"2024071119550965200_ref56","volume-title":"Luotuo: An Instruction-following Chinese Language model, LoRA tuning on LLaMA","author":"Ziang Leng","year":"2023"},{"key":"2024071119550965200_ref57","volume-title":"ERNIE Bot","year":"2023"},{"key":"2024071119550965200_ref58","volume-title":"LCSTS: A large scale Chinese short text summarization dataset","author":"Hu","year":"2015"},{"key":"2024071119550965200_ref59","doi-asserted-by":"crossref","DOI":"10.1162\/tacl_a_00305","article-title":"Investigating prior knowledge for challenging Chinese machine reading comprehension","volume-title":"Transactions of the Association for Computational Linguistics","author":"Sun","year":"2020"},{"key":"2024071119550965200_ref60","volume-title":"Dataset and neural recurrent sequence labeling model for open-domain factoid question answering","author":"Li","year":"2016"},{"key":"2024071119550965200_ref61","doi-asserted-by":"crossref","first-page":"778","DOI":"10.18653\/v1\/P19-1075","article-title":"ChID: A large-scale Chinese IDiom dataset for cloze test","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Zheng","year":"2019"}],"container-title":["Data 
Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/direct.mit.edu\/dint\/article-pdf\/6\/2\/375\/2459010\/dint_a_00251.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/direct.mit.edu\/dint\/article-pdf\/6\/2\/375\/2459010\/dint_a_00251.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,14]],"date-time":"2025-03-14T07:42:52Z","timestamp":1741938172000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.sciengine.com\/doi\/10.1162\/dint_a_00251"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024]]},"references-count":61,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2024,5,1]]}},"URL":"https:\/\/doi.org\/10.1162\/dint_a_00251","relation":{},"ISSN":["2641-435X"],"issn-type":[{"value":"2641-435X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024]]},"published":{"date-parts":[[2024]]}}}