{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,13]],"date-time":"2026-02-13T21:25:12Z","timestamp":1771017912827,"version":"3.50.1"},"reference-count":30,"publisher":"MDPI AG","issue":"19","license":[{"start":{"date-parts":[[2023,9,27]],"date-time":"2023-09-27T00:00:00Z","timestamp":1695772800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Forest tree species identification in the field of remote sensing has become an important research topic. Currently, few research methods combine global and local features, making it challenging to accurately handle the similarity between different categories. Moreover, using a single deep layer for feature extraction overlooks the unique feature information at intermediate levels. This paper proposes a remote sensing image forest tree species classification method based on the Multi-Scale Convolution and Multi-Level Fusion Network (MCMFN) architecture. In the MCMFN network, the Shallow Multi-Scale Convolution Attention Combination (SMCAC) module replaces the original 7 \u00d7 7 convolution at the first layer of ResNet-50. This module uses multi-scale convolution to capture different receptive fields, and combines it with the attention mechanism to effectively enhance the ability of shallow features and obtain richer feature information. Additionally, to make efficient use of intermediate and deep-level feature information, the Multi-layer Selection Feature Fusion (MSFF) module is employed to improve classification accuracy. Experimental results on the Aerial forest dataset demonstrate a classification accuracy of 91.03%. The comprehensive experiments indicate the feasibility and effectiveness of the proposed MCMFN network.<\/jats:p>","DOI":"10.3390\/rs15194732","type":"journal-article","created":{"date-parts":[[2023,9,28]],"date-time":"2023-09-28T01:51:14Z","timestamp":1695865874000},"page":"4732","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":18,"title":["A Multi-Scale Convolution and Multi-Layer Fusion Network for Remote Sensing Forest Tree Species Recognition"],"prefix":"10.3390","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0009-0009-7483-7616","authenticated-orcid":false,"given":"Jinjing","family":"Hou","sequence":"first","affiliation":[{"name":"School of Mathematics and Computer Science, Zhejiang A&F University, Hangzhou 311300, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7915-8684","authenticated-orcid":false,"given":"Houkui","family":"Zhou","sequence":"additional","affiliation":[{"name":"Zhejiang Provincial Key Laboratory of Forestry Intelligent Monitoring and Information Technology, Zhejiang A&F University, Hangzhou 311300, China"}]},{"given":"Junguo","family":"Hu","sequence":"additional","affiliation":[{"name":"Zhejiang Provincial Key Laboratory of Forestry Intelligent Monitoring and Information Technology, Zhejiang A&F University, Hangzhou 311300, China"}]},{"given":"Huimin","family":"Yu","sequence":"additional","affiliation":[{"name":"College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027, China"},{"name":"State Key Laboratory of CAD & CG, Zhejiang University, Hangzhou 310027, China"}]},{"given":"Haoji","family":"Hu","sequence":"additional","affiliation":[{"name":"State Key Laboratory of CAD & CG, Zhejiang University, Hangzhou 310027, China"}]}],"member":"1968","published-online":{"date-parts":[[2023,9,27]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1016\/0034-4257(80)90044-9","article-title":"Coniferous tree species mapping using LANDSAT data","volume":"9","author":"Walsh","year":"1980","journal-title":"Remote Sens. Environ."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"460","DOI":"10.1016\/S0034-4257(01)00324-8","article-title":"Phenological Differences in Tasseled Cap Indices Improve Deciduous Forest Classification","volume":"80","author":"Dymond","year":"2002","journal-title":"Remote Sens. Environ."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"6712","DOI":"10.1109\/TGRS.2018.2841823","article-title":"Exploring Hierarchical Convolutional Features for Hyperspectral Image Classification","volume":"56","author":"Cheng","year":"2018","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"70","DOI":"10.1016\/j.rse.2014.03.018","article-title":"Urban Tree Species Mapping Using Hyperspectral and Lidar Data Fusion","volume":"148","author":"Alonzo","year":"2014","journal-title":"Remote Sens. Environ."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1879","DOI":"10.1007\/s11676-020-01245-0","article-title":"Tree Species Classification Using Deep Learning and RGB Optical Images Obtained by an Unmanned Aerial Vehicle","volume":"32","author":"Zhang","year":"2021","journal-title":"J. For. Res."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Li, H., Hu, B., Li, Q., and Jing, L. (2021). CNN-Based Individual Tree Species Classification Using High-Resolution Satellite Imagery and Airborne LiDAR Data. Forests, 12.","DOI":"10.3390\/f12121697"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1145\/3065386","article-title":"ImageNet Classification with Deep Convolutional Neural Networks","volume":"60","author":"Krizhevsky","year":"2017","journal-title":"Commun. ACM"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Wei, L., Yangqing, J., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., and Rabinovich, A. (2015, January 7\u201312). Going Deeper with Convolutions. Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"ref_9","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep Residual Learning for Image Recognition. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Xie, S., Girshick, R., Dollar, P., Tu, Z., and He, K. (2017, January 21\u201326). Aggregated Residual Transformations for Deep Neural Networks. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.634"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Huang, G., Liu, Z., Van Der Maaten, L., and Weinberger, K.Q. (2017, January 21\u201326). Densely Connected Convolutional Networks. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.243"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Chen, Y., Jin, X., Feng, J., and Yan, S. (2017). Training Group Orthogonal Neural Networks with Privileged Information. arXiv.","DOI":"10.24963\/ijcai.2017\/212"},{"key":"ref_14","unstructured":"Mehta, S., and Rastegari, M. (2022). Separable Self-Attention for Mobile Vision Transformers. arXiv."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Howard, A., Sandler, M., Chen, B., Wang, W., Chen, L.-C., Tan, M., Chu, G., Vasudevan, V., Zhu, Y., and Pang, R. (November, January 27). Searching for MobileNetV3. Proceedings of the 2019 IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea.","DOI":"10.1109\/ICCV.2019.00140"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Zhang, X., Zhou, X., Lin, M., and Sun, J. (2018, January 18\u201323). ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00716"},{"key":"ref_17","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2021). An Image Is Worth 16 \u00d7 16 Words: Transformers for Image Recognition at Scale. arXiv."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021, January 11\u201317). Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows. Proceedings of the 2021 IEEE\/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Hu, J., Shen, L., and Sun, G. (2018, January 18\u201322). Squeeze-and-Excitation Networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00745"},{"key":"ref_20","first-page":"1","article-title":"Extended Vision Transformer (ExViT) for Land Use and Land Cover Classification: A Multimodal Deep Learning Framework","volume":"61","author":"Yao","year":"2023","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Hong, D., Gao, L., Wu, X., Yao, J., and Zhang, B. (2021, January 24). Revisiting Graph Convolutional Networks with Mini-Batch Sampling for Hyperspectral Image Classification. Proceedings of the 2021 11th Workshop on Hyperspectral Imaging and Signal Processing: Evolution in Remote Sensing (WHISPERS), Amsterdam, Netherlands.","DOI":"10.1109\/WHISPERS52202.2021.9484014"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/TGRS.2023.3313876","article-title":"LRR-Net: An Interpretable Deep Unfolding Network for Hyperspectral Anomaly Detection","volume":"61","author":"Li","year":"2023","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"109301","DOI":"10.1016\/j.measurement.2021.109301","article-title":"Tree Species Classification of LiDAR Data Based on 3D Deep Learning","volume":"177","author":"Liu","year":"2021","journal-title":"Measurement"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Nezami, S., Khoramshahi, E., Nevalainen, O., P\u00f6l\u00f6nen, I., and Honkavaara, E. (2020). Tree Species Classification of Drone Hyperspectral and RGB Imagery with Deep Learning Convolutional Neural Networks. Remote Sens., 12.","DOI":"10.20944\/preprints202002.0334.v1"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"7013","DOI":"10.1109\/JSTARS.2022.3199618","article-title":"Dual-Concentrated Network With Morphological Features for Tree Species Classification Using Hyperspectral Image","volume":"15","author":"Guo","year":"2022","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"1904","DOI":"10.1109\/TPAMI.2015.2389824","article-title":"Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition","volume":"37","author":"He","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Wang, G., Cheng, L., Lin, J., Dai, Y., and Zhang, T. (2021). Fine-Grained Classification Based on Multi-Scale Pyramid Convolution Networks. PLoS ONE, 16.","DOI":"10.1371\/journal.pone.0254054"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Pan, X., Ge, C., Lu, R., Song, S., Chen, G., Huang, Z., and Huang, G. (2022, January 18\u201324). On the Integration of Self-Attention and Convolution. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00089"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"681","DOI":"10.5194\/essd-15-681-2023","article-title":"TreeSatAI Benchmark Archive: A Multi-Sensor, Multi-Label Dataset for Tree Species Classification in Remote Sensing; ESSD\u2014Land\/Land Cover and Land Use","volume":"15","author":"Ahlswede","year":"2022","journal-title":"Earth Syst. Sci. Data Discuss."},{"key":"ref_30","unstructured":"Tan, M., and Le, Q. (2019, January 9\u201315). EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. Proceedings of the 36th International Conference on Machine Learning, Long Beach, CA, USA."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/19\/4732\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T20:59:55Z","timestamp":1760129995000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/19\/4732"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,9,27]]},"references-count":30,"journal-issue":{"issue":"19","published-online":{"date-parts":[[2023,10]]}},"alternative-id":["rs15194732"],"URL":"https:\/\/doi.org\/10.3390\/rs15194732","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,9,27]]}}}