{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,28]],"date-time":"2026-02-28T21:26:32Z","timestamp":1772313992680,"version":"3.50.1"},"reference-count":32,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2024,11,5]],"date-time":"2024-11-05T00:00:00Z","timestamp":1730764800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Natural Science Foundation of China","award":["62066009"],"award-info":[{"award-number":["62066009"]}]},{"name":"National Natural Science Foundation of China","award":["2020010308"],"award-info":[{"award-number":["2020010308"]}]},{"name":"National Natural Science Foundation of China","award":["Gui Ke AB22080047"],"award-info":[{"award-number":["Gui Ke AB22080047"]}]},{"name":"National Natural Science Foundation of China","award":["2022KY0799"],"award-info":[{"award-number":["2022KY0799"]}]},{"name":"National Natural Science Foundation of China","award":["2023KY0814"],"award-info":[{"award-number":["2023KY0814"]}]},{"name":"National Natural Science Foundation of China","award":["XJ20KT17"],"award-info":[{"award-number":["XJ20KT17"]}]},{"name":"Key Research and Development Project of Guilin","award":["62066009"],"award-info":[{"award-number":["62066009"]}]},{"name":"Key Research and Development Project of Guilin","award":["2020010308"],"award-info":[{"award-number":["2020010308"]}]},{"name":"Key Research and Development Project of Guilin","award":["Gui Ke AB22080047"],"award-info":[{"award-number":["Gui Ke AB22080047"]}]},{"name":"Key Research and Development Project of Guilin","award":["2022KY0799"],"award-info":[{"award-number":["2022KY0799"]}]},{"name":"Key Research and Development Project of Guilin","award":["2023KY0814"],"award-info":[{"award-number":["2023KY0814"]}]},{"name":"Key Research and Development Project of Guilin","award":["XJ20KT17"],"award-info":[{"award-number":["XJ20KT17"]}]},{"name":"Guangxi Key Research and Development Project","award":["62066009"],"award-info":[{"award-number":["62066009"]}]},{"name":"Guangxi Key Research and Development Project","award":["2020010308"],"award-info":[{"award-number":["2020010308"]}]},{"name":"Guangxi Key Research and Development Project","award":["Gui Ke AB22080047"],"award-info":[{"award-number":["Gui Ke AB22080047"]}]},{"name":"Guangxi Key Research and Development Project","award":["2022KY0799"],"award-info":[{"award-number":["2022KY0799"]}]},{"name":"Guangxi Key Research and Development Project","award":["2023KY0814"],"award-info":[{"award-number":["2023KY0814"]}]},{"name":"Guangxi Key Research and Development Project","award":["XJ20KT17"],"award-info":[{"award-number":["XJ20KT17"]}]},{"name":"The Project for Enhancing Young and Middle-aged Teacher\u2019s Research Basis Ability in Colleges of Guangxi","award":["62066009"],"award-info":[{"award-number":["62066009"]}]},{"name":"The Project for Enhancing Young and Middle-aged Teacher\u2019s Research Basis Ability in Colleges of Guangxi","award":["2020010308"],"award-info":[{"award-number":["2020010308"]}]},{"name":"The Project for Enhancing Young and Middle-aged Teacher\u2019s Research Basis Ability in Colleges of Guangxi","award":["Gui Ke AB22080047"],"award-info":[{"award-number":["Gui Ke AB22080047"]}]},{"name":"The Project for Enhancing Young and Middle-aged Teacher\u2019s Research Basis Ability in Colleges of Guangxi","award":["2022KY0799"],"award-info":[{"award-number":["2022KY0799"]}]},{"name":"The Project for Enhancing Young and Middle-aged Teacher\u2019s Research Basis Ability in Colleges of Guangxi","award":["2023KY0814"],"award-info":[{"award-number":["2023KY0814"]}]},{"name":"The Project for Enhancing Young and Middle-aged Teacher\u2019s Research Basis Ability in Colleges of Guangxi","award":["XJ20KT17"],"award-info":[{"award-number":["XJ20KT17"]}]},{"name":"The Fund of Guilin University of Aerospace Technology","award":["62066009"],"award-info":[{"award-number":["62066009"]}]},{"name":"The Fund of Guilin University of Aerospace Technology","award":["2020010308"],"award-info":[{"award-number":["2020010308"]}]},{"name":"The Fund of Guilin University of Aerospace Technology","award":["Gui Ke AB22080047"],"award-info":[{"award-number":["Gui Ke AB22080047"]}]},{"name":"The Fund of Guilin University of Aerospace Technology","award":["2022KY0799"],"award-info":[{"award-number":["2022KY0799"]}]},{"name":"The Fund of Guilin University of Aerospace Technology","award":["2023KY0814"],"award-info":[{"award-number":["2023KY0814"]}]},{"name":"The Fund of Guilin University of Aerospace Technology","award":["XJ20KT17"],"award-info":[{"award-number":["XJ20KT17"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>The accuracy of traditional topic models may be compromised due to the sparsity of co-occurring vocabulary in the corpus, whereas conventional word embedding models tend to excessively prioritize contextual semantic information and inadequately capture domain-specific features in the text. This paper proposes a hybrid semantic representation method that combines a topic model that integrates conceptual knowledge with a weighted word embedding model. Specifically, we construct a topic model incorporating the Probase concept knowledge base to perform topic clustering and obtain topic semantic representation. Additionally, we design a weighted word embedding model to enhance the contextual semantic information representation of the text. The feature-based information fusion model is employed to integrate the two textual representations and generate a hybrid semantic representation. The hybrid semantic representation model proposed in this study was evaluated based on various English composition test sets. The findings demonstrate that the model presented in this paper exhibits superior accuracy and practical value compared to existing text representation methods.<\/jats:p>","DOI":"10.3390\/info15110708","type":"journal-article","created":{"date-parts":[[2024,11,5]],"date-time":"2024-11-05T08:25:14Z","timestamp":1730795114000},"page":"708","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["A Hybrid Semantic Representation Method Based on Fusion Conceptual Knowledge and Weighted Word Embeddings for English Texts"],"prefix":"10.3390","volume":"15","author":[{"given":"Zan","family":"Qiu","sequence":"first","affiliation":[{"name":"Guangxi Key Laboratory of Image and Graphic Intelligent Processing, School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China"},{"name":"School of Computer Science and Engineering, Guilin University of Aerospace Technology, Guilin 541004, China"}]},{"given":"Guimin","family":"Huang","sequence":"additional","affiliation":[{"name":"Guangxi Key Laboratory of Image and Graphic Intelligent Processing, School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China"}]},{"given":"Xingguo","family":"Qin","sequence":"additional","affiliation":[{"name":"Guangxi Key Laboratory of Image and Graphic Intelligent Processing, School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China"}]},{"given":"Yabing","family":"Wang","sequence":"additional","affiliation":[{"name":"Guangxi Key Laboratory of Image and Graphic Intelligent Processing, School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China"}]},{"given":"Jiahao","family":"Wang","sequence":"additional","affiliation":[{"name":"Guangxi Key Laboratory of Image and Graphic Intelligent Processing, School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China"}]},{"given":"Ya","family":"Zhou","sequence":"additional","affiliation":[{"name":"Guangxi Key Laboratory of Image and Graphic Intelligent Processing, School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China"}]}],"member":"1968","published-online":{"date-parts":[[2024,11,5]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Babi\u0107, K., Martin\u010di\u0107-Ip\u0161i\u0107, S., and Me\u0161trovi\u0107, A. (2020). Survey of Neural Text Representation Models. Information, 11.","DOI":"10.3390\/info11110511"},{"key":"ref_2","first-page":"40","article-title":"Deep Learning\u2014Based Text Classification","volume":"54","author":"Minaee","year":"2021","journal-title":"ACM Comput. Surv."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"604","DOI":"10.1109\/TNNLS.2020.2979670","article-title":"A Survey of the Usages of Deep Learning for Natural Language Processing","volume":"32","author":"Otter","year":"2020","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"36120","DOI":"10.1109\/ACCESS.2023.3266377","article-title":"A Survey of Text Representation and Embedding Techniques in NLP","volume":"11","author":"Patil","year":"2023","journal-title":"IEEE Access"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"794","DOI":"10.1109\/TFUZZ.2017.2690222","article-title":"Fuzzy Bag-of-Words Model for Document Representation","volume":"26","author":"Zhao","year":"2018","journal-title":"IEEE Trans. Fuzzy Syst."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"623","DOI":"10.1007\/s00607-019-00755-y","article-title":"Study on Text Representation Method Based on Deep Learning and Topic Information","volume":"102","author":"Jiang","year":"2019","journal-title":"Computing"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"2928","DOI":"10.1109\/TKDE.2014.2313872","article-title":"BTM: Topic Modeling over Short Texts","volume":"26","author":"Cheng","year":"2014","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/1667053.1667056","article-title":"The Nested Chinese Restaurant Process and Bayesian Nonparametric Inference of Topic Hierarchies","volume":"57","author":"Blei","year":"2010","journal-title":"J. ACM"},{"key":"ref_9","unstructured":"Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., and Dean, J. (2013, January 5\u20138). Distributed Representations of Words and Phrases and Their Compositionality. Proceedings of the 27th Annual Conference on Neural Information Processing Systems 2013, Lake Tahoe, NV, USA."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Pennington, J., Socher, R., and Manning, C.D. (2014, January 25\u201329). Glove: Global Vectors for Word Representation. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar.","DOI":"10.3115\/v1\/D14-1162"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Wu, W., Li, H., Wang, H., and Zhu, K.Q. (2012, January 20\u201324). Probase: A Probabilistic Taxonomy for Text Understanding. Proceedings of the ACM SIGMOD International Conference on Management of Data, Scottsdale, AZ, USA.","DOI":"10.1145\/2213836.2213891"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"42","DOI":"10.1016\/j.neucom.2019.08.080","article-title":"Incorporating Context-Relevant Concepts into Convolutional Neural Networks for Short Text Classification","volume":"386","author":"Xu","year":"2020","journal-title":"Neurocomputing"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3462478","article-title":"Topic Modeling Using Latent Dirichlet Allocation: A Survey","volume":"54","author":"Chauhan","year":"2022","journal-title":"ACM Comput. Surv."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Tian, H., and Wu, L. (2018, January 23\u201325). Microblog Emotional Analysis Based on TF-IWF Weighted Word2vec Model. Proceedings of the IEEE 9th International Conference on Software Engineering and Service Science, Beijing, China.","DOI":"10.1109\/ICSESS.2018.8663837"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Xun, G., Li, Y., Gao, J., and Zhang, A. (2017, January 13\u201317). Collaboratively Improving Topic Discovery and Word Embeddings by Coordinating Global and Local Contexts. Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada.","DOI":"10.1145\/3097983.3098009"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"118442","DOI":"10.1016\/j.eswa.2022.118442","article-title":"DeepSumm: Exploiting Topic Models and Sequence to Sequence Networks for Extractive Text Summarization","volume":"211","author":"Joshi","year":"2023","journal-title":"Expert Syst. Appl."},{"key":"ref_17","first-page":"993","article-title":"Latent Dirichlet Allocation","volume":"3","author":"Blei","year":"2003","journal-title":"J. Mach. Learn. Res."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Karras, C., Karras, A., Tsolis, D., Giotopoulos, K., and Sioutas, S. (2022, January 23\u201325). Distributed Gibbs Sampling and LDA Modelling for Large Scale Big Data Management on PySpark. Proceedings of the South-East Europe Design Automation, Computer Engineering, Computer Networks and Social Media Conference, Ioannina, Greece.","DOI":"10.1109\/SEEDA-CECNSM57760.2022.9932990"},{"key":"ref_19","first-page":"827","article-title":"Research on Subject Pattern Based on Deep Learning","volume":"43","author":"Huang","year":"2020","journal-title":"J. Comput. Sci."},{"key":"ref_20","first-page":"14331","article-title":"Knowledge-Aware Bayesian Deep Topic Model","volume":"35","author":"Wang","year":"2022","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Andrzejewski, D., Zhu, X., and Craven, M. (2009, January 14\u201318). Incorporating Domain Knowledge into Topic Modeling via Dirichlet Forest Priors. Proceedings of the 26th Annual International Conference on Machine Learning, Montreal, QC, Canada.","DOI":"10.1145\/1553374.1553378"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"11407","DOI":"10.1007\/s00500-019-04604-0","article-title":"A Novel Topic Model for Documents by Incorporating Semantic Relations between Words","volume":"24","author":"Chen","year":"2019","journal-title":"Soft Comput."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"103215","DOI":"10.1016\/j.ipm.2022.103215","article-title":"Graph Neural Topic Model with Commonsense Knowledge","volume":"60","author":"Zhu","year":"2023","journal-title":"Inf. Process. Manag."},{"key":"ref_24","unstructured":"Liang, Y., Zhang, Y., Wei, B., Jin, Z., Zhang, R., Zhang, Y., and Chen, Q. (2017, January 4\u20139). Incorporating Knowledge Graph Embeddings into Topic Modeling. Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Shi, T., Kang, K., Choo, J., and Reddy, C.K. (2018, January 23\u201327). Short-Text Topic Modeling via Non-Negative Matrix Factorization Enriched with Local Word-Context Correlations. Proceedings of the World Wide Web Conference, Lyon, France.","DOI":"10.1145\/3178876.3186009"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"114231","DOI":"10.1016\/j.eswa.2020.114231","article-title":"A New Topic Modeling Based Approach for Aspect Extraction in Aspect Based Sentiment Analysis: SS-LDA","volume":"168","author":"Ozyurt","year":"2021","journal-title":"Expert Syst. Appl."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"106411","DOI":"10.1016\/j.infsof.2020.106411","article-title":"A Systematic Comparison of Search-Based Approaches for LDA Hyperparameter Tuning","volume":"130","author":"Panichella","year":"2021","journal-title":"Inf. Softw. Technol."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.knosys.2018.08.011","article-title":"Experimental Explorations on Short Text Topic Mining between LDA and NMF Based Schemes","volume":"163","author":"Chen","year":"2019","journal-title":"Knowl. Based Syst."},{"key":"ref_29","unstructured":"Devlin, J., Chang, M., and Lee, K. (2018). Bert: Pre-Training of Deep Bidirectional Transformers for Language Understanding. arXiv."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Peinelt, N., Nguyen, D., and Liakata, M. (2020, January 5\u201310). TBERT: Topic Models and BERT Joining Forces for Semantic Similarity Detection. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online.","DOI":"10.18653\/v1\/2020.acl-main.630"},{"key":"ref_31","first-page":"4582480","article-title":"A Topic Recognition Method of News Text Based on Word Embedding Enhancement","volume":"2022","author":"Du","year":"2022","journal-title":"Comput. Intell. Neurosci."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Manning, C.D., Surdeanu, M., Bauer, J., Finkel, J.R., Bethard, S., and McClosky, D. (2014, January 23\u201324). The Stanford CoreNLP Natural Language Processing Toolkit. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, Baltimore, MD, USA.","DOI":"10.3115\/v1\/P14-5010"}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/15\/11\/708\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T16:26:42Z","timestamp":1760113602000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/15\/11\/708"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,11,5]]},"references-count":32,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2024,11]]}},"alternative-id":["info15110708"],"URL":"https:\/\/doi.org\/10.3390\/info15110708","relation":{},"ISSN":["2078-2489"],"issn-type":[{"value":"2078-2489","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,11,5]]}}}