{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,1]],"date-time":"2026-02-01T02:05:00Z","timestamp":1769911500562,"version":"3.49.0"},"reference-count":31,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2023,12,13]],"date-time":"2023-12-13T00:00:00Z","timestamp":1702425600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["BDCC"],"abstract":"<jats:p>Text classification is the task of estimating the genre of a document based on information such as word co-occurrence and frequency of occurrence. Text classification has been studied by various approaches. In this study, we focused on text classification using graph structure data. Conventional graph-based methods express relationships between words and relationships between words and documents as weights between nodes. Then, a graph neural network is used for learning. However, there is a problem that conventional methods are not able to represent the relationship between documents on the graph. In this paper, we propose a graph structure that considers the relationships between documents. In the proposed method, the cosine similarity of document vectors is set as weights between document nodes. This completes a graph that considers the relationship between documents. The graph is then input into a graph convolutional neural network for training. Therefore, the aim of this study is to improve the text classification performance of conventional methods by using this graph that considers the relationships between document nodes. In this study, we conducted evaluation experiments using five different corpora of English documents. The results showed that the proposed method outperformed the performance of the conventional method by up to 1.19%, indicating that the use of relationships between documents is effective. In addition, the proposed method was shown to be particularly effective in classifying long documents.<\/jats:p>","DOI":"10.3390\/bdcc7040181","type":"journal-article","created":{"date-parts":[[2023,12,13]],"date-time":"2023-12-13T04:14:42Z","timestamp":1702440882000},"page":"181","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Text Classification Based on the Heterogeneous Graph Considering the Relationships between Documents"],"prefix":"10.3390","volume":"7","author":[{"ORCID":"https:\/\/orcid.org\/0009-0002-6575-7859","authenticated-orcid":false,"given":"Hiromu","family":"Nakajima","sequence":"first","affiliation":[{"name":"School of Science and Engineering, Ibaraki University, Hitachi 316-8511, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8101-2796","authenticated-orcid":false,"given":"Minoru","family":"Sasaki","sequence":"additional","affiliation":[{"name":"Department of Computer and Information Sciences, Faculty of Engineering, Ibaraki University, Hitachi 316-8511, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2023,12,13]]},"reference":[{"key":"ref_1","unstructured":"Kipf, T.N., and Welling, M. (2017). Semi-supervised classification with graph convolutional networks. arXiv."},{"key":"ref_2","unstructured":"Yao, L., Mao, C., and Luo, Y. (February, January 27). Graph convolutional networks for text classification. Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Lu, Z., Du, P., and Nie, J.Y. (2020, January 14\u201317). Vgcn-bert: Augmenting bert with graph embedding for text classification. Proceedings of the European Conference on Information Retrieval, Lisbon, Portugal.","DOI":"10.1007\/978-3-030-45439-5_25"},{"key":"ref_4","unstructured":"Lin, Y., Meng, Y., Sun, X., Han, Q., Kuang, K., Li, J., and Wu, F. (2021). Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, Association for Computational Linguistics."},{"key":"ref_5","unstructured":"Nakajima, H., and Sasaki, M. (2022, January 20\u201322). Text Classification Using a Graph Based on Relationships Between Documents. Proceedings of the 36th Pacific Asia Conference on Language, Information and Computation, Manila, Philippines. Available online: https:\/\/aclanthology.org\/2022.paclic-1.14.pdf."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1007\/s13042-010-0001-0","article-title":"Understanding bag-of-words model: A statistical framework","volume":"1","author":"Zhang","year":"2010","journal-title":"Int. J. Mach. Learn. Cybern."},{"key":"ref_7","unstructured":"Cavnar, W.B., and Trenkle, J.M. (1994, January 1). N-gram-based text categorization. In Proceedings of SDAIR-94, 3rd Annual Symposium on Document Analysis and Information Retrieval, Las Vegas, NV, USA."},{"key":"ref_8","unstructured":"Baeza-Yates, R., and Ribeiro-Neto, B. (1999). Modern Information Retrieval, ACM Press."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"843","DOI":"10.1016\/j.is.2011.02.002","article-title":"Word co-occurrence features for text classification","volume":"36","author":"Figueiredo","year":"2011","journal-title":"Inf. Syst."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"404","DOI":"10.1145\/321075.321084","article-title":"Automatic indexing: An experimental inquiry","volume":"8","author":"Maron","year":"1961","journal-title":"J. ACM"},{"key":"ref_11","unstructured":"Ng, A.Y., and Jordan, M.I. (2001, January 3\u20138). On discriminative vs. generative classifiers: A comparison of logistic regression and naive Bayes. Proceedings of the 14th International Conference on Neural Information Processing Systems: Natural and Synthetic, Vancouver, BC, Canada."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1109\/TIT.1967.1053964","article-title":"Nearest neighbor pattern classification","volume":"13","author":"Cover","year":"1967","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Joachims, T. (1998, January 21\u201323). Text categorization with support vector machines: Learning with many relevant features. Proceedings of the 10th European Conference on Machine Learning, Chemnitz, Germany. Available online: https:\/\/link.springer.com\/chapter\/10.1007\/BFb0026683.","DOI":"10.1007\/BFb0026683"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1007\/BF00058655","article-title":"Bagging predictors","volume":"24","author":"Breiman","year":"1996","journal-title":"Mach. Learn."},{"key":"ref_15","unstructured":"Mikolov, T., Chen, K., Corrado, G., and Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Pennington, J., Socher, R., and Manning, C.D. (2014, January 25\u201329). GloVe: Global vectors for word representation. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar.","DOI":"10.3115\/v1\/D14-1162"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Joulin, A., Grave, E., Bojanowski, P., and Mikolov, T. (2016). Bag of Tricks for Efficient Text Classification. arXiv.","DOI":"10.18653\/v1\/E17-2068"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long short-term memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., and Zettlemoyer, L. (2018, January 1\u20136). Deep Contextualized Word Representations. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, New Orleans, LA, USA. Available online: http:\/\/aclweb.org\/anthology\/N18-1202.","DOI":"10.18653\/v1\/N18-1202"},{"key":"ref_20","unstructured":"Devlin, J., Chang, M., Lee, K., and Toutanova, K. (2019, January 2\u20137). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, MN, USA."},{"key":"ref_21","unstructured":"Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., and Stoyanov, V. (2019). RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Wang, G., Li, C., Wang, W., Zhang, Y., Shen, D., Zhang, X., Henao, R., and Carin, L. (2018, January 15\u201320). Joint Embedding of Words and Labels for Text Classification. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, Melbourne, Australia. Available online: https:\/\/aclanthology.org\/P18-1216.","DOI":"10.18653\/v1\/P18-1216"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Shen, D., Wang, G., Wang, W., Min, M.R., Su, Q., Zhang, Y., Li, C., Henao, R., and Carin, L. (2018, January 15\u201320). Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, Melbourne, Australia. Available online: http:\/\/aclweb.org\/anthology\/P18-1041.","DOI":"10.18653\/v1\/P18-1041"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Iyyer, M., Manjunatha, V., Boyd-Graber, J., and Daum\u00e9, H. (2015, January 27\u201331). Deep Unordered Composition Rivals Syntactic Methods for Text Classification. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, Beijing, China.","DOI":"10.3115\/v1\/P15-1162"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Kalchbrenner, N., Grefenstette, E., and Blunsom, P. (2014, January 22\u201327). A Convolutional Neural Network for Modelling Sentences. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, Baltimore, MD, USA.","DOI":"10.3115\/v1\/P14-1062"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Tai, K.S., Socher, R., and Manning, C.D. (2015, January 27\u201331). Improved Semantic Representations from Tree-Structured Long Short-Term Memory Networks. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, Beijing, China.","DOI":"10.3115\/v1\/P15-1150"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1109\/TNN.2008.2005605","article-title":"The graph neural network model","volume":"20","author":"Scarselli","year":"2008","journal-title":"IEEE Trans. Neural Netw."},{"key":"ref_28","unstructured":"Wu, F., Zhang, T., Souza, A.H., Fifty, C., Yu, T., and Weinberger, K.Q. (2019, January 10\u201315). Simplifying Graph Convolutional Networks. Proceedings of the 36th International Conference on Machine Learning, Long Beach, CA, USA. Available online: https:\/\/arxiv.org\/abs\/1902.07153."},{"key":"ref_29","unstructured":"Kipf, T.N., and Welling, M. (2016). Variational graph auto-encoders. arXiv."},{"key":"ref_30","unstructured":"Velickovic, P., Cucurull, G., Casanova, A., Romero, A., Lio, P., and Bengio, Y. (2017). Graph attention networks. arXiv."},{"key":"ref_31","unstructured":"Witten, I.H., Moffat, A., and Bell, T.C. (1999). Managing Gigabytes: Compressing and Indexing Documents and Images, Morgan Kaufmann."}],"container-title":["Big Data and Cognitive Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-2289\/7\/4\/181\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T21:37:54Z","timestamp":1760132274000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-2289\/7\/4\/181"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,12,13]]},"references-count":31,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["bdcc7040181"],"URL":"https:\/\/doi.org\/10.3390\/bdcc7040181","relation":{},"ISSN":["2504-2289"],"issn-type":[{"value":"2504-2289","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,12,13]]}}}