{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,26]],"date-time":"2026-03-26T04:24:33Z","timestamp":1774499073990,"version":"3.50.1"},"reference-count":47,"publisher":"MDPI AG","issue":"7","license":[{"start":{"date-parts":[[2025,7,8]],"date-time":"2025-07-08T00:00:00Z","timestamp":1751932800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Natural Science Foundation of China","award":["42201505"],"award-info":[{"award-number":["42201505"]}]},{"name":"National Natural Science Foundation of China","award":["622QN352"],"award-info":[{"award-number":["622QN352"]}]},{"name":"National Natural Science Foundation of China","award":["2021YFF070420304"],"award-info":[{"award-number":["2021YFF070420304"]}]},{"name":"National Natural Science Foundation of China","award":["2025000010"],"award-info":[{"award-number":["2025000010"]}]},{"name":"Natural Science Foundation of Hainan Province of China","award":["42201505"],"award-info":[{"award-number":["42201505"]}]},{"name":"Natural Science Foundation of Hainan Province of China","award":["622QN352"],"award-info":[{"award-number":["622QN352"]}]},{"name":"Natural Science Foundation of Hainan Province of China","award":["2021YFF070420304"],"award-info":[{"award-number":["2021YFF070420304"]}]},{"name":"Natural Science Foundation of Hainan Province of China","award":["2025000010"],"award-info":[{"award-number":["2025000010"]}]},{"name":"National Key Research and Development Program of China","award":["42201505"],"award-info":[{"award-number":["42201505"]}]},{"name":"National Key Research and Development Program of China","award":["622QN352"],"award-info":[{"award-number":["622QN352"]}]},{"name":"National Key Research and Development Program of China","award":["2021YFF070420304"],"award-info":[{"award-number":["2021YFF070420304"]}]},{"name":"National Key Research and Development Program of China","award":["2025000010"],"award-info":[{"award-number":["2025000010"]}]},{"name":"Computer Network and Information Special Project Of Chinese Academy of Sciences","award":["42201505"],"award-info":[{"award-number":["42201505"]}]},{"name":"Computer Network and Information Special Project Of Chinese Academy of Sciences","award":["622QN352"],"award-info":[{"award-number":["622QN352"]}]},{"name":"Computer Network and Information Special Project Of Chinese Academy of Sciences","award":["2021YFF070420304"],"award-info":[{"award-number":["2021YFF070420304"]}]},{"name":"Computer Network and Information Special Project Of Chinese Academy of Sciences","award":["2025000010"],"award-info":[{"award-number":["2025000010"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IJGI"],"abstract":"<jats:p>Earth observation data serve as a fundamental resource in Earth system science. The rapid advancement of remote sensing and in situ measurement technologies has led to the generation of massive volumes of data, accompanied by a growing body of geographic textual information. Efficient and accurate classification and management of these geographic texts has become a critical challenge in the field. However, the effectiveness of traditional classification approaches is hindered by several issues, including data sparsity, class imbalance, semantic ambiguity, and the prevalence of domain-specific terminology. To address these limitations and enable the intelligent management of geographic information, this study proposes an efficient geographic text classification framework based on large language models (LLMs), tailored to the unique semantic and structural characteristics of geographic data. Specifically, LLM-based data augmentation strategies are employed to mitigate the scarcity of labeled data and class imbalance. A semantic vector database is utilized to filter the label space prior to inference, enhancing the model\u2019s adaptability to diverse geographic terms. Furthermore, few-shot prompt learning guides LLMs in understanding domain-specific language, while an output alignment mechanism improves classification stability for complex descriptions. This approach offers a scalable solution for the automated semantic classification of geographic text for unlocking the potential of ever-expanding geospatial big data, thereby advancing intelligent information processing and knowledge discovery in the geospatial domain.<\/jats:p>","DOI":"10.3390\/ijgi14070268","type":"journal-article","created":{"date-parts":[[2025,7,8]],"date-time":"2025-07-08T11:58:07Z","timestamp":1751975887000},"page":"268","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["HierLabelNet: A Two-Stage LLMs Framework with Data Augmentation and Label Selection for Geographic Text Classification"],"prefix":"10.3390","volume":"14","author":[{"given":"Zugang","family":"Chen","sequence":"first","affiliation":[{"name":"Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China"},{"name":"College of Computer Science and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-0815-798X","authenticated-orcid":false,"given":"Le","family":"Zhao","sequence":"additional","affiliation":[{"name":"Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China"},{"name":"College of Computer Science and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001, China"}]}],"member":"1968","published-online":{"date-parts":[[2025,7,8]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"40","DOI":"10.5334\/dsj-2019-040","article-title":"NASA\u2019s Earth Observing Data and Information System\u2014Near-Term Challenges","volume":"18","author":"Behnke","year":"2019","journal-title":"Data Sci. J."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"393","DOI":"10.1016\/j.isprsjprs.2022.12.006","article-title":"Recent Advances in Using Chinese Earth Observation Satellites for Remote Sensing of Vegetation","volume":"195","author":"Zhang","year":"2023","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1116","DOI":"10.1109\/LCOMM.2022.3151657","article-title":"Collaborative Data Offloading for Earth Observation Satellite Networks","volume":"26","author":"He","year":"2022","journal-title":"IEEE Commun. Lett."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"398","DOI":"10.1080\/20964471.2022.2107420","article-title":"Publishing China Satellite Data on the GEOSS Platform","volume":"7","author":"Roncella","year":"2023","journal-title":"Big Earth Data"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Ochiai, O., Harada, M., and Hamamoto, K. (2022, January 17\u201320). Earth Observation Data Utilization for SDGs Indicators: 15.4.2 and 11.3.1. Proceedings of the 2022 IEEE International Conference on Big Data (Big Data), Osaka, Japan.","DOI":"10.1109\/BigData55660.2022.10020760"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Rinaldi, M., Ruggieri, S., Ciavarella, F., De Santis, A.P., Palmisano, D., Balenzano, A., Mattia, F., and Satalino, G. (2023, January 16\u201321). How Can Be Used Earth Observation Data in Conservation Agriculture Monitoring?. Proceedings of the IGARSS 2023\u20142023 IEEE International Geoscience and Remote Sensing Symposium, Pasadena, CA, USA.","DOI":"10.1109\/IGARSS52108.2023.10282377"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Kavvada, A. (2021, January 11\u201316). Knowledge Generation Using Earth Observations to Support Sustainable Development. Proceedings of the 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS, Brussels, Belgium.","DOI":"10.1109\/IGARSS47720.2021.9553801"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Caon, M., Ros, P.M., Martina, M., Bianchi, T., Magli, E., Membibre, F., Ramos, A., Latorre, A., Kerr, M., and Wiehle, S. (2021, January 11\u201316). Very Low Latency Architecture for Earth Observation Satellite Onboard Data Handling, Compression, and Encryption. Proceedings of the 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS, Brussels, Belgium.","DOI":"10.1109\/IGARSS47720.2021.9554085"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"529","DOI":"10.1109\/JSTARS.2020.3038152","article-title":"Semantically Enriched Crop Type Classification and Linked Earth Observation Data to Support the Common Agricultural Policy Monitoring","volume":"14","author":"Rousi","year":"2021","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Minaee, S., Kalchbrenner, N., Cambria, E., Nikzad, N., Chenaghlu, M., and Gao, J. (2021). Deep Learning Based Text Classification: A Comprehensive Review. arXiv.","DOI":"10.1145\/3439726"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Boukhers, Z., Khan, A., Ramadan, Q., and Yang, C. (2024, January 3\u20136). Large Language Model in Medical Informatics: Direct Classification and Enhanced Text Representations for Automatic ICD Coding. Proceedings of the 2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Lisbon, Portugal.","DOI":"10.1109\/BIBM62325.2024.10822419"},{"key":"ref_12","unstructured":"Beliveau, V., Kaas, H., Prener, M., Ladefoged, C.N., Elliott, D., Knudsen, G.M., Pinborg, L.H., and Ganz, M. (2024). Classification of Radiological Text in Small and Imbalanced Datasets in a Non-English Language. arXiv."},{"key":"ref_13","unstructured":"Wang, J., Zhao, Z., Wang, Z.J., Cheng, B.D., Nie, L., Luo, W., Yu, Z.Y., and Yuan, L.W. (2025). GeoRAG: A Question-Answering Approach from a Geographical Perspective. arXiv."},{"key":"ref_14","first-page":"86","article-title":"Evaluation of Geographical Distortions in Language Models: A Crucial Step towards Equitable Representations","volume":"Volume 15243","author":"Decoupes","year":"2025","journal-title":"Discovery Science, Proceedings of the 27th International Conference, DS 2024, Pisa, Italy, 14\u201316 October 2024"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Chen, P., Xu, H., Zhang, C., and Huang, R. (2022, January 10\u201315). Crossroads, Buildings and Neighborhoods: A Dataset for Fine-Grained Location Recognition. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Seattle, WA, USA.","DOI":"10.18653\/v1\/2022.naacl-main.243"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"714","DOI":"10.1080\/13658816.2018.1458986","article-title":"A Natural Language Processing and Geospatial Clustering Framework for Harvesting Local Place Names from Geotagged Housing Advertisements","volume":"33","author":"Hu","year":"2019","journal-title":"Int. J. Geogr. Inf. Sci."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"979","DOI":"10.1007\/s12145-022-00775-x","article-title":"Few-Shot Learning for Name Entity Recognition in Geological Text Based on GeoBERT","volume":"15","author":"Liu","year":"2022","journal-title":"Earth Sci. Inform."},{"key":"ref_18","unstructured":"Vajjala, S., and Shimangaud, S. (2025). Text Classification in the LLM Era\u2014Where Do We Stand?. arXiv."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Kong, E., Zhang, J., Yu, D., and Shen, M. (2024, January 12\u201314). Chinese Short Text Classification Method Based on Enhanced Prompt Learning. Proceedings of the 2024 7th International Conference on Computer Information Science and Application Technology (CISAT), Hangzhou, China.","DOI":"10.1109\/CISAT62382.2024.10695255"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Sun, Z., Harit, A., Cristea, A.I., Yu, J., Shi, L., and Al Moubayed, N. (2022, January 18\u201323). Contrastive Learning with Heterogeneous Graph Attention Networks on Short Text Classification. Proceedings of the 2022 International Joint Conference on Neural Networks (IJCNN), Padua, Italy.","DOI":"10.1109\/IJCNN55064.2022.9892257"},{"key":"ref_21","first-page":"6252","article-title":"Deep Short Text Classification with Knowledge Powered Attention","volume":"33","author":"Chen","year":"2019","journal-title":"Proc. AAAI Conf. Artif. Intell."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"105462","DOI":"10.1016\/j.cageo.2023.105462","article-title":"An Ontology-Based Framework for Semantic Geographic Information Systems Development and Understanding","volume":"181","author":"Kuo","year":"2023","journal-title":"Comput. Geosci."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"1214","DOI":"10.1162\/tacl_a_00697","article-title":"Retrieval-Style In-Context Learning for Few-Shot Hierarchical Text Classification","volume":"12","author":"Chen","year":"2024","journal-title":"Trans. Assoc. Comput. Linguist."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Li, Z., Zhu, H., Lu, Z., and Yin, M. (2023, January 6\u201310). Synthetic Data Generation with Large Language Models for Text Classification: Potential and Limitations. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Singapore.","DOI":"10.18653\/v1\/2023.emnlp-main.647"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Liu, P., Wang, X., Xiang, C., and Meng, W. (2020, January 21\u201323). A Survey of Text Data Augmentation. Proceedings of the 2020 International Conference on Computer Communication and Network Security (CCNS), Xi\u2019an, China.","DOI":"10.1109\/CCNS50731.2020.00049"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Wei, J., and Zou, K. (2019, January 3\u20137). EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China.","DOI":"10.18653\/v1\/D19-1670"},{"key":"ref_27","first-page":"146","article-title":"A Survey on Data Augmentation for Text Classification","volume":"55","author":"Bayer","year":"2022","journal-title":"ACM Comput. Surv."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Tan, Z., Li, D., Wang, S., Beigi, A., Jiang, B., Bhattacharjee, A., Karami, M., Li, J., Cheng, L., and Liu, H. (2024, January 12\u201316). Large Language Models for Data Annotation and Synthesis: A Survey. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, Miami, FL, USA.","DOI":"10.18653\/v1\/2024.emnlp-main.54"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Long, L., Wang, R., Xiao, R., Zhao, J., Ding, X., Chen, G., and Wang, H. (2024). On LLMs-Driven Synthetic Data Generation, Curation, and Evaluation: A Survey. Findings of the Association for Computational Linguistics, Proceedings of the ACL 2024, Bangkok, Thailand, 11\u201316 August 2024, Association for Computational Linguistics.","DOI":"10.18653\/v1\/2024.findings-acl.658"},{"key":"ref_30","unstructured":"Guo, X., and Chen, Y. (2024). Generative AI for Synthetic Data Generation: Methods, Challenges and the Future. arXiv."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Choi, J., Kim, Y., Yu, S., Yun, J., and Kim, Y. (2024, January 12\u201316). UniGen: Universal Domain Generalization for Sentiment Classification via Zero-Shot Dataset Generation. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, Miami, FL, USA.","DOI":"10.18653\/v1\/2024.emnlp-main.1"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Tao, C., Fan, X., and Yang, Y. (2024, January 20\u201322). Harnessing LLMs for API Interactions: A Framework for Classification and Synthetic Data Generation. Proceedings of the 2024 5th International Conference on Computers and Artificial Intelligence Technology (CAIT), Hangzhou, China.","DOI":"10.1109\/CAIT64506.2024.10962957"},{"key":"ref_33","unstructured":"Patwa, P., Filice, S., Chen, Z., Castellucci, G., Rokhlenko, O., and Malmasi, S. (2024, January 20\u201325). Enhancing Low-Resource LLMs Classification with PEFT and Synthetic Data. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), Torino, Italia."},{"key":"ref_34","unstructured":"Ubani, S., Polat, S.O., and Nielsen, R. (2023). ZeroShotDataAug: Generating and Augmenting Training Data with ChatGPT. arXiv."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"907","DOI":"10.1109\/TBDATA.2025.3536934","article-title":"AugGPT: Leveraging ChatGPT for Text Data Augmentation","volume":"11","author":"Dai","year":"2025","journal-title":"IEEE Trans. Big Data"},{"key":"ref_36","unstructured":"Yehudai, A., Carmeli, B., Mass, Y., Arviv, O., Mills, N., Shnarch, E., and Choshen, L. (2024, January 7\u201311). Achieving Human Parity in Content-Grounded Datasets Generation. Proceedings of the Twelfth International Conference on Learning Representations (ICLR), Vienna, Austria."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Li, Z., Chen, W., Li, S., Wang, H., Qian, J., and Yan, X. (2022). Controllable Dialogue Simulation with In-Context Learning. Findings of the Association for Computational Linguistics, Proceedings of the EMNLP 2022, Abu Dhabi, United Arab Emirates, 7\u201311 December 2022, Association for Computational Linguistics.","DOI":"10.18653\/v1\/2022.findings-emnlp.318"},{"key":"ref_38","first-page":"4171","article-title":"BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding","volume":"Volume 1 (Long and Short Papers)","author":"Devlin","year":"2019","journal-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies"},{"key":"ref_39","unstructured":"Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., and Stoyanov, V. (2019). RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Gretz, S., Halfon, A., Shnayderman, I., Toledo-Ronen, O., Spector, A., Dankin, L., Katsis, Y., Arviv, O., Katz, Y., and Slonim, N. (2023). Zero-Shot Topical Text Classification with LLMs\u2014An Experimental Study. Findings of the Association for Computational Linguistics, Proceedings of the EMNLP 2023, Singapore, 6\u201310 December 2023, Association for Computational Linguistics.","DOI":"10.18653\/v1\/2023.findings-emnlp.647"},{"key":"ref_41","unstructured":"Tian, K., and Chen, H. (2024, January 20\u201325). ESG-GPT:GPT4-Based Few-Shot Prompt Learning for Multi-Lingual ESG News Text Classification. Proceedings of the Joint Workshop of the 7th Financial Technology and Natural Language Processing, the 5th Knowledge Discovery from Unstructured Data in Financial Services, and the 4th Workshop on Economics and Natural Language Processing, Torino, Italia."},{"key":"ref_42","first-page":"24696","article-title":"Boosting Short Text Classification with Multi-Source Information Exploration and Dual-Level Contrastive Learning","volume":"39","author":"Liu","year":"2025","journal-title":"Proc. AAAI Conf. Artif. Intell."},{"key":"ref_43","unstructured":"Kostina, A., Dikaiakos, M.D., Stefanidis, D., and Pallis, G. (2025). Large Language Models For Text Classification: Case Study And Comprehensive Review. arXiv."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Ahmadnia, S., Jordehi, A.Y., Heyran, M.H.K., Mirroshandel, S.A., Rambow, O., and Caragea, C. (2025). Active Few-Shot Learning for Text Classification. arXiv.","DOI":"10.18653\/v1\/2025.naacl-long.340"},{"key":"ref_45","unstructured":"Ku, L.-W., Martins, A., and Srikumar, V. (2024). Mitigating Boundary Ambiguity and Inherent Bias for Text Classification in the Era of Large Language Models. Findings of the Association for Computational Linguistics, Proceedings of the ACL 2024, Bangkok, Thailand, 11\u201316 August 2024, Association for Computational Linguistics."},{"key":"ref_46","unstructured":"Yang, A., Yang, B., Zhang, B., Hui, B., Zheng, B., Yu, B., Li, C., Liu, D., Huang, F., and Wei, H. (2025). Qwen2.5 Technical Report. arXiv."},{"key":"ref_47","unstructured":"Grattafiori, A., Dubey, A., Jauhri, A., Pandey, A., Kadian, A., Al-Dahle, A., Letman, A., Mathur, A., Schelten, A., and Vaughan, A. (2024). The Llama 3 Herd of Models. arXiv."}],"container-title":["ISPRS International Journal of Geo-Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2220-9964\/14\/7\/268\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T18:06:46Z","timestamp":1760033206000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2220-9964\/14\/7\/268"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7,8]]},"references-count":47,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2025,7]]}},"alternative-id":["ijgi14070268"],"URL":"https:\/\/doi.org\/10.3390\/ijgi14070268","relation":{},"ISSN":["2220-9964"],"issn-type":[{"value":"2220-9964","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,7,8]]}}}