{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T14:20:06Z","timestamp":1773843606699,"version":"3.50.1"},"reference-count":41,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2025,10,27]],"date-time":"2025-10-27T00:00:00Z","timestamp":1761523200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"2025 Xinjiang Autonomous Region University Fundamental Scientific Research Project","award":["XJEDU2025P043"],"award-info":[{"award-number":["XJEDU2025P043"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Symmetry"],"abstract":"<jats:p>With the advancement of educational informatization, vast amounts of Chinese text are generated across online platforms and digital textbooks. Effectively classifying such text is essential for intelligent education systems. This study conducts a systematic comparative evaluation of three Transformer-based models\u2014TinyBERT-4L, BERT-base-Chinese, and RoBERTa-wwm-ext\u2014for Chinese educational text classification. Using a balanced four-category subset of the THUCNews corpus (Education, Technology, Finance, and Stock), the research investigates the trade-off between classification effectiveness and computational efficiency under a unified experimental framework. The experimental results show that RoBERTa-wwm-ext achieves the highest effectiveness (93.12% Accuracy, 93.08% weighted F1), validating the benefits of whole-word masking and extended pre-training. BERT-base-Chinese maintains a balanced performance (91.74% Accuracy, 91.66% F1) with moderate computational demand. These findings reveal a clear symmetry\u2013asymmetry dynamic: structural symmetry arises from the shared Transformer encoder and identical fine-tuning setup, while asymmetry emerges from differences in model scale and pre-training strategy. 
This interplay leads to distinct accuracy\u2013latency trade-offs, providing practical guidance for deploying pre-trained language models in resource-constrained intelligent education systems.<\/jats:p>","DOI":"10.3390\/sym17111812","type":"journal-article","created":{"date-parts":[[2025,10,28]],"date-time":"2025-10-28T11:47:19Z","timestamp":1761652039000},"page":"1812","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Symmetry and Asymmetry in Pre-Trained Transformer Models: A Comparative Study of TinyBERT, BERT, and RoBERTa for Chinese Educational Text Classification"],"prefix":"10.3390","volume":"17","author":[{"given":"Munire","family":"Muhetaer","sequence":"first","affiliation":[{"name":"College of Computer and Information Engineering, Xinjiang Agricultural University, Urumqi 830052, China"}]},{"given":"Xiaoyan","family":"Meng","sequence":"additional","affiliation":[{"name":"College of Computer and Information Engineering, Xinjiang Agricultural University, Urumqi 830052, China"}]},{"given":"Jing","family":"Zhu","sequence":"additional","affiliation":[{"name":"College of Computer and Information Engineering, Xinjiang Agricultural University, Urumqi 830052, China"}]},{"given":"Aixiding","family":"Aikebaier","sequence":"additional","affiliation":[{"name":"College of Computer and Information Engineering, Xinjiang Agricultural University, Urumqi 830052, China"}]},{"given":"Liyaer","family":"Zu","sequence":"additional","affiliation":[{"name":"College of Mechanical and Electronic Engineering, Xinjiang Agricultural University, Urumqi 830052, China"}]},{"given":"Yawen","family":"Bai","sequence":"additional","affiliation":[{"name":"College of Computer and Information Engineering, Xinjiang Agricultural University, Urumqi 830052, China"}]}],"member":"1968","published-online":{"date-parts":[[2025,10,27]]},"reference":[{"key":"ref_1","unstructured":"Zhu, S., Xu, S., Sun, H., Pan, L., Cui, M., Du, J., Jin, R., Branco, A., and Xiong, D. (2024). Multilingual large language models: A systematic survey. arXiv."},{"key":"ref_2","unstructured":"Zhu, S., Cui, M., and Xiong, D. (2024, January 20\u201325). Towards robust in-context learning for machine translation with large language models. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), Turin, Italy."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"103825","DOI":"10.1016\/j.ipm.2024.103825","article-title":"FEDS-ICL: Enhancing translation ability and efficiency of large language model by optimizing demonstration selection","volume":"61","author":"Zhu","year":"2024","journal-title":"Inf. Process. Manag."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"12135","DOI":"10.18653\/v1\/2024.acl-long.656","article-title":"LANDeRMT: Detecting and routing language-aware neurons for selectively finetuning LLMs to machine translation","volume":"Volume 1","author":"Zhu","year":"2024","journal-title":"Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Dong, T., Li, B., Liu, J., Zhu, S., and Xiong, D. (August, January 27). MLAS-LoRA: Language-Aware parameters detection and LoRA-based knowledge transfer for multilingual machine translation. 
Proceedings of the 2025 Annual Meeting of the Association for Computational Linguistics (Long Papers), Vienna, Austria.","DOI":"10.18653\/v1\/2025.acl-long.762"},{"key":"ref_6","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, \u0141., and Polosukhin, I. (2017, January 4\u20139). Attention is all you need. Proceedings of the Advances in Neural Information Processing Systems (NeurIPS), Long Beach, CA, USA."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Cui, Y., Che, W., Liu, T., Qin, B., and Yang, Z. (2020). Revisiting Pre-Trained Models for Chinese NLP (MacBERT). Findings of EMNLP, Association for Computational Linguistics.","DOI":"10.18653\/v1\/2020.findings-emnlp.58"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Jiao, X., Yin, Y., Shang, L., Jiang, X., Chen, X., Li, L., Wang, F., and Liu, Q. (2020). TinyBERT: Distilling BERT for Natural Language Understanding. Findings of EMNLP, Association for Computational Linguistics.","DOI":"10.18653\/v1\/2020.findings-emnlp.372"},{"key":"ref_9","unstructured":"Kaplan, J., McCandlish, S., Henighan, T., Brown, T.B., Chess, B., Child, R., Gray, S., Radford, A., Wu, J., and Amodei, D. (2020). Scaling laws for neural language models. arXiv."},{"key":"ref_10","unstructured":"Clark, K., Khandelwal, U., Levy, O., and Manning, C.D. (August, January 28). What does BERT look at? An analysis of attention. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL), Florence, Italy."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3530811","article-title":"Efficient transformers: A survey","volume":"55","author":"Tay","year":"2022","journal-title":"ACM Comput. Surv."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/505282.505283","article-title":"Machine learning in automated text categorization","volume":"34","author":"Sebastiani","year":"2002","journal-title":"ACM Comput. Surv."},{"key":"ref_13","unstructured":"Mikolov, T., Chen, K., Corrado, G., and Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Kim, Y. (2014, January 25\u201329). Convolutional neural networks for sentence classification. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar.","DOI":"10.3115\/v1\/D14-1181"},{"key":"ref_15","unstructured":"Zhang, X., Zhao, J., and LeCun, Y. (2015, January 7\u201312). Character-level convolutional networks for text classification. Proceedings of the Advances in Neural Information Processing Systems (NeurIPS), Montreal, QC, Canada."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long short-term memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput."},{"key":"ref_17","unstructured":"Liu, P., Qiu, X., and Huang, X. (2016, January 9\u201315). Recurrent neural network for text classification with multi-task learning. Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), New York, NY, USA."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Yang, Z., Yang, D., Dyer, C., He, X., Smola, A., and Hovy, E. (2016, January 12\u201317). Hierarchical attention networks for document classification. 
Proceedings of the North American Chapter of the Association for Computational Linguistics-Human Language Technologies (NAACL-HLT), San Diego, CA, USA.","DOI":"10.18653\/v1\/N16-1174"},{"key":"ref_19","unstructured":"Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2019, January 2\u20137). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of the Association for Computational Linguistics-Human Language Technologies (NAACL-HLT), Minneapolis, MN, USA."},{"key":"ref_20","unstructured":"Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., and Stoyanov, V. (2019). RoBERTa: A robustly optimized BERT pretraining approach. arXiv."},{"key":"ref_21","unstructured":"Yang, Z., Dai, Z., Yang, Y., Carbonell, J., Salakhutdinov, R., and Le, Q.V. (2019, January 8\u201314). XLNet: Generalized autoregressive pretraining for language understanding. Proceedings of the Advances in Neural Information Processing Systems (NeurIPS), Vancouver, BC, Canada."},{"key":"ref_22","unstructured":"Clark, K., Luong, M.T., Le, Q.V., and Manning, C.D. (May, January 26). ELECTRA: Pre-training text encoders as discriminators rather than generators. Proceedings of the International Conference on Learning Representations (ICLR), Addis Ababa, Ethiopia."},{"key":"ref_23","unstructured":"Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., and Soricut, R. (May, January 26). ALBERT: A lite BERT for self-supervised learning of language representations. Proceedings of the International Conference on Learning Representations (ICLR), Addis Ababa, Ethiopia."},{"key":"ref_24","unstructured":"He, P., Liu, X., Gao, J., and Chen, W. (2021, January 3\u20137). DeBERTa: Decoding-enhanced BERT with disentangled attention. Proceedings of the International Conference on Learning Representations (ICLR), Virtually."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"3504","DOI":"10.1109\/TASLP.2021.3124365","article-title":"Pre-Training with Whole Word Masking for Chinese BERT","volume":"29","author":"Cui","year":"2021","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Gururangan, S., Marasovi\u0107, A., Swayamdipta, S., Lo, K., Beltagy, I., Downey, D., and Smith, N.A. (2020, January 9\u201311). Don\u2019t stop pretraining: Adapt language models to domains and tasks. Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), Seattle, WA, USA.","DOI":"10.18653\/v1\/2020.acl-main.740"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"1234","DOI":"10.1093\/bioinformatics\/btz682","article-title":"BioBERT: A pre-trained biomedical language representation model for biomedical text mining","volume":"36","author":"Lee","year":"2020","journal-title":"Bioinformatics"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Xu, H., Hu, S., Zhang, H., Li, J., Cao, R., Xu, Y., and Sun, X. (2020, January 8\u201313). CLUE: A Chinese Language Understanding Evaluation Benchmark. Proceedings of the International Conference on Computational Linguistics (COLING), Barcelona, Spain.","DOI":"10.18653\/v1\/2020.coling-main.419"},{"key":"ref_29","unstructured":"Sanh, V., Debut, L., Chaumond, J., and Wolf, T. (2019). DistilBERT: A distilled version of BERT: Smaller, faster, cheaper and lighter. arXiv."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Sun, Z., Yu, H., Song, X., Liu, R., Yang, Y., Zhou, D., and Zhou, J. 
(2020, January 9\u201311). MobileBERT: A Compact Task-Agnostic BERT for Resource-Limited Devices. Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), Seattle, WA, USA.","DOI":"10.18653\/v1\/2020.acl-main.195"},{"key":"ref_31","unstructured":"Wang, W., Wei, F., Dong, L., Bao, H., Yang, N., and Zhou, M. (2020, January 6\u201312). MiniLM: Deep self-attention distillation for task-agnostic compression of pre-trained transformers. Proceedings of the Advances in Neural Information Processing Systems (NeurIPS), Vancouver, BC, Canada."},{"key":"ref_32","unstructured":"Michel, P., Levy, O., and Neubig, G. (2019, January 8\u201314). Are sixteen heads really better than one?. Proceedings of the Advances in Neural Information Processing Systems (NeurIPS), Vancouver, BC, Canada."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"54","DOI":"10.1145\/3381831","article-title":"Green AI","volume":"63","author":"Schwartz","year":"2020","journal-title":"Commun. ACM"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Sun, C., Qiu, X., Xu, Y., and Huang, X. (2019, January 18\u201320). How to fine-tune BERT for text classification?. Proceedings of the China National Conference on Computational Linguistics (CCL), Kunming, China.","DOI":"10.1007\/978-3-030-32381-3_16"},{"key":"ref_35","unstructured":"Strubell, E., Ganesh, A., and McCallum, A. (August, January 28). Energy and policy considerations for deep learning in NLP. Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), Austin, TX, USA."},{"key":"ref_36","unstructured":"Patterson, D., Gonzalez, J., Le, Q., Liang, C., Munguia, L.M., Rothchild, D., and Dean, J. (2021). Carbon emissions and large neural network training. arXiv."},{"key":"ref_37","first-page":"544","article-title":"Toward sustainable AI: Energy-aware deployment and efficient model design","volume":"3","author":"Wu","year":"2022","journal-title":"IEEE Trans. Artif. Intell."},{"key":"ref_38","unstructured":"Huang, J., Cheng, J., and He, J. (2019, January 16\u201321). Speed\/accuracy trade-offs for modern convolutional object detectors. Proceedings of the Annual IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA."},{"key":"ref_39","unstructured":"Tang, R., Lu, Y., Liu, L., Mou, L., Vechtomova, O., and Lin, J. (2019). Distilling tasks from task-specific BERT models. arXiv."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3439726","article-title":"Deep learning\u2013based text classification: A comprehensive review","volume":"54","author":"Minaee","year":"2021","journal-title":"ACM Comput. 
Surv."},{"key":"ref_41","first-page":"604","article-title":"A survey of deep learning applications in natural language processing","volume":"2","author":"Zhang","year":"2021","journal-title":"AI Open"}],"container-title":["Symmetry"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-8994\/17\/11\/1812\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,28]],"date-time":"2025-10-28T12:10:39Z","timestamp":1761653439000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-8994\/17\/11\/1812"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,27]]},"references-count":41,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2025,11]]}},"alternative-id":["sym17111812"],"URL":"https:\/\/doi.org\/10.3390\/sym17111812","relation":{},"ISSN":["2073-8994"],"issn-type":[{"value":"2073-8994","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,10,27]]}}}