{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,14]],"date-time":"2026-01-14T18:54:52Z","timestamp":1768416892224,"version":"3.49.0"},"reference-count":40,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2019,3,6]],"date-time":"2019-03-06T00:00:00Z","timestamp":1551830400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61725105 and 41801349"],"award-info":[{"award-number":["61725105 and 41801349"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Aircraft recognition in remote sensing images has long been a meaningful topic. Most related methods treat entire images as a whole and do not concentrate on the features of parts. In fact, a variety of aircraft types have small interclass variance, and the main evidence for classifying subcategories is related to some discriminative object parts. In this paper, we introduce the idea of fine-grained visual classification (FGVC) and attempt to make full use of the features from discriminative object parts. First, multiple class activation mapping (MultiCAM) is proposed to extract the discriminative parts of aircrafts of different categories. Second, we present a mask filter (MF) strategy to enhance the discriminative object parts and filter the interference of the background from original images. Third, a selective connected feature fusion method is proposed to fuse the features extracted from both networks, focusing on the original images and the results of MF, respectively. 
Compared with class activation mapping (CAM), which relies on a single predicted category, MultiCAM makes full use of the predictions of all categories and thus avoids the incorrect discriminative parts that a single wrong prediction would produce. Additionally, the designed MF preserves object scale information and helps the network concentrate on the object itself rather than the interfering background. Experiments on a challenging dataset demonstrate that our method achieves state-of-the-art performance.<\/jats:p>","DOI":"10.3390\/rs11050544","type":"journal-article","created":{"date-parts":[[2019,3,7]],"date-time":"2019-03-07T10:52:22Z","timestamp":1551955942000},"page":"544","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":55,"title":["MultiCAM: Multiple Class Activation Mapping for Aircraft Recognition in Remote Sensing Images"],"prefix":"10.3390","volume":"11","author":[{"given":"Kun","family":"Fu","sequence":"first","affiliation":[{"name":"Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China"},{"name":"School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100190, China"},{"name":"Key Laboratory of Network Information System Technology, Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China"},{"name":"Institute of Electronics, Chinese Academy of Sciences, Suzhou 215000, China"}]},{"given":"Wei","family":"Dai","sequence":"additional","affiliation":[{"name":"Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China"},{"name":"School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100190, China"},{"name":"Key Laboratory of Network Information System Technology, Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China"}]},{"given":"Yue","family":"Zhang","sequence":"additional","affiliation":[{"name":"Institute 
of Electronics, Chinese Academy of Sciences, Beijing 100190, China"},{"name":"Key Laboratory of Network Information System Technology, Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2877-0384","authenticated-orcid":false,"given":"Zhirui","family":"Wang","sequence":"additional","affiliation":[{"name":"Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China"},{"name":"Key Laboratory of Network Information System Technology, Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China"}]},{"given":"Menglong","family":"Yan","sequence":"additional","affiliation":[{"name":"Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China"},{"name":"Key Laboratory of Network Information System Technology, Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China"}]},{"given":"Xian","family":"Sun","sequence":"additional","affiliation":[{"name":"Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China"},{"name":"Key Laboratory of Network Information System Technology, Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China"}]}],"member":"1968","published-online":{"date-parts":[[2019,3,6]]},"reference":[{"key":"ref_1","unstructured":"Dalal, N., and Triggs, B. (2005, January 20\u201325). Histograms of oriented gradients for human detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Diego, CA, USA."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Lowe, D.G. (1999, January 20\u201327). Object recognition from local scale-invariant features. 
Proceedings of the Seventh IEEE International Conference on Computer Vision (ICCV), Kerkyra, Greece.","DOI":"10.1109\/ICCV.1999.790410"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1023\/B:VISI.0000029664.99615.94","article-title":"Distinctive image features from scale-invariant keypoints","volume":"60","author":"Lowe","year":"2004","journal-title":"Int. J. Comput. Vis."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"307","DOI":"10.1049\/ip-vis:20049020","article-title":"Aircraft type recognition in satellite images","volume":"152","author":"Hsieh","year":"2005","journal-title":"IEE Proc.-Vis. Image Signal Process."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1759","DOI":"10.1016\/j.patrec.2009.11.018","article-title":"Artificial bee colony (ABC) optimized edge potential function (EPF) approach to target recognition for low-altitude aircraft","volume":"31","author":"Xu","year":"2010","journal-title":"Pattern Recognit. Lett."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"573","DOI":"10.1109\/LGRS.2012.2214022","article-title":"Aircraft recognition in high-resolution satellite images using coarse-to-fine shape prior","volume":"10","author":"Liu","year":"2013","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_7","unstructured":"Dang, L.M., Hassan, S.I., Suhyeon, I., Sangaiah, A.K., Mehmood, I., Rho, S., Seo, S., and Moon, H. (2018). UAV-based wilt detection system via convolutional neural networks. Sustain. Comput. Inform. Syst."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"042621","DOI":"10.1117\/1.JRS.11.042621","article-title":"Deep convolutional neural network for classifying Fusarium wilt of radish from unmanned aerial vehicles","volume":"11","author":"Ha","year":"2017","journal-title":"J. Appl. Remote Sens."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Yang, X., Sun, H., Fu, K., Yang, J., Sun, X., Yan, M., and Guo, Z. (2018). 
Automatic ship detection in remote sensing images from Google Earth of complex scenes based on multiscale rotation dense feature pyramid networks. Remote Sens., 10.","DOI":"10.3390\/rs10010132"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Fu, K., Li, Y., Sun, H., Yang, X., Xu, G., Li, Y., and Sun, X. (2018). A ship rotation detection model in remote sensing images based on feature fusion pyramid network and deep reinforcement learning. Remote Sens., 10.","DOI":"10.3390\/rs10121922"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1600","DOI":"10.1109\/LGRS.2018.2846802","article-title":"Cloud and cloud shadow detection using multilevel feature fused segmentation network","volume":"15","author":"Yan","year":"2018","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"39401","DOI":"10.1109\/ACCESS.2018.2856088","article-title":"An end-to-end neural network for road extraction from remote sensing imagery by multiple feature pyramid network","volume":"6","author":"Gao","year":"2018","journal-title":"IEEE Access"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"745","DOI":"10.1080\/2150704X.2015.1072288","article-title":"Object recognition in remote sensing images using sparse deep belief networks","volume":"6","author":"Diao","year":"2015","journal-title":"Remote Sens. Lett."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"1413","DOI":"10.1109\/LGRS.2017.2715858","article-title":"Aircraft recognition based on landmark detection in remote sensing images","volume":"14","author":"Zhao","year":"2017","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"282","DOI":"10.1109\/LGRS.2017.2786232","article-title":"Aircraft type recognition based on segmentation with deep convolutional neural networks","volume":"15","author":"Zuo","year":"2018","journal-title":"IEEE Geosci. Remote Sens. 
Lett."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Sun, H., Zuo, J., Wang, H., Xu, G., and Sun, X. (2018). Aircraft type recognition in remote sensing images based on feature learning with conditional generative adversarial networks. Remote Sens., 10.","DOI":"10.3390\/rs10071123"},{"key":"ref_17","unstructured":"Wah, C., Branson, S., Welinder, P., Perona, P., and Belongie, S. (2011). The Caltech-UCSD Birds-200-2011 Dataset, California Institute of Technology. Technical Report CNS-TR-2011-001."},{"key":"ref_18","unstructured":"Welinder, P., Branson, S., Mita, T., Wah, C., Schroff, F., Belongie, S., and Perona, P. (2010). Caltech-UCSD Birds 200, California Institute of Technology. Technical Report CNS-TR-2010-001."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Krause, J., Stark, M., Jia, D., and Li, F.F. (2013, January 2\u20138). 3D object representations for fine-grained categorization. Proceedings of the 2013 IEEE International Conference on Computer Vision Workshops, Sydney, NSW, Australia.","DOI":"10.1109\/ICCVW.2013.77"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S.E., Anguelov, D., Erhan, D., Vanhoucke, V., and Rabinovich, A. (2015, January 7\u201312). Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Huang, S., Xu, Z., Tao, D., and Zhang, Y. (2016, January 27\u201330). Part-stacked CNN for fine-grained visual categorization. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.132"},{"key":"ref_23","unstructured":"Wei, X., Xie, C., and Wu, J. (2016, January 27\u201330). Mask-CNN: Localizing parts and selecting descriptors for fine-grained image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA."},{"key":"ref_24","unstructured":"Xiao, T., Xu, Y., Yang, K., Zhang, J., Peng, Y., and Zhang, Z. (2015, January 7\u201312). The application of two-level attention models in deep convolutional neural network for fine-grained image classification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Zhang, X., Xiong, H., Zhou, W., Lin, W., and Tian, Q. (2016, January 27\u201330). Picking deep filter responses for fine-grained image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.128"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Lin, T., Roychowdhury, A., and Maji, S. (2015, January 7\u201313). Bilinear CNN models for fine-grained visual recognition. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile.","DOI":"10.1109\/ICCV.2015.170"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Fu, J., Zheng, H., and Mei, T. (2017, January 21\u201326). Look closer to see better: Recurrent attention convolutional neural network for fine-grained image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.476"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., and Torralba, A. (2016, January 27\u201330). 
Learning deep features for discriminative localization. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.319"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"1487","DOI":"10.1109\/TIP.2017.2774041","article-title":"Object-part attention model for fine-grained image classification","volume":"27","author":"Peng","year":"2018","journal-title":"IEEE Trans. Image Process."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Durand, T., Mordan, T., Thome, N., and Cord, M. (2017, January 21\u201326). WILDCAT: Weakly supervised learning of deep ConvNets for image classification, pointwise localization and segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.631"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Fu, K., Lu, W., Diao, W., Yan, M., Sun, H., Zhang, Y., and Sun, X. (2018). WSF-NET: Weakly supervised feature-fusion network for binary segmentation in remote sensing image. Remote Sens., 10.","DOI":"10.3390\/rs10121970"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Yang, Y., and Newsam, S. (2010, January 2\u20135). Bag-of-visual-words and spatial extensions for land-use classification. Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems, San Jose, CA, USA.","DOI":"10.1145\/1869790.1869829"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"3965","DOI":"10.1109\/TGRS.2017.2685945","article-title":"AID: A benchmark data set for performance evaluation of aerial scene classification","volume":"55","author":"Xia","year":"2017","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Xia, G.S., Bai, X., Ding, J., Zhu, Z., Belongie, S., Luo, J., Datcu, M., Pelillo, M., and Zhang, L. (2018, January 18\u201323). 
DOTA: A large-scale dataset for object detection in aerial images. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00418"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"7405","DOI":"10.1109\/TGRS.2016.2601622","article-title":"Learning rotation-invariant convolutional neural networks for object detection in VHR optical remote sensing images","volume":"54","author":"Gong","year":"2016","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_36","first-page":"91","article-title":"Faster R-CNN: Towards real-time object detection with region proposal networks","volume":"2015","author":"Ren","year":"2015","journal-title":"Neural Inf. Process. Syst."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Lin, T., Dollar, P., Girshick, R.B., He, K., Hariharan, B., and Belongie, S.J. (2017, January 21\u201326). Feature pyramid networks for object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_38","unstructured":"Sutskever, I., Martens, J., Dahl, G., and Hinton, G. (2013, January 16\u201321). On the importance of initialization and momentum in deep learning. Proceedings of the International Conference on International Conference on Machine Learning, Atlanta, GA, USA."},{"key":"ref_39","unstructured":"Luo, W., Li, Y., Urtasun, R., and Zemel, R.S. (2016, January 5\u201310). Understanding the effective receptive field in deep convolutional neural networks. Proceedings of the 30th International Conference on Neural Information Processing Systems, Barcelona, Spain."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Noh, H., Hong, S., and Han, B. (2015, January 7\u201313). Learning deconvolution network for semantic segmentation. 
Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile.","DOI":"10.1109\/ICCV.2015.178"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/11\/5\/544\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T12:36:45Z","timestamp":1760186205000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/11\/5\/544"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,3,6]]},"references-count":40,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2019,3]]}},"alternative-id":["rs11050544"],"URL":"https:\/\/doi.org\/10.3390\/rs11050544","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,3,6]]}}}