{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T02:38:46Z","timestamp":1760236726905,"version":"build-2065373602"},"reference-count":43,"publisher":"MDPI AG","issue":"24","license":[{"start":{"date-parts":[[2021,12,17]],"date-time":"2021-12-17T00:00:00Z","timestamp":1639699200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["41801265","U2033216","42171448"],"award-info":[{"award-number":["41801265","U2033216","42171448"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>The key to fine-grained aircraft recognition is discovering the subtle traits that can distinguish different subcategories. Early approaches leverage part annotations of fine-grained objects to derive rich representations. However, manual labeling part information is cumbersome. In response to this issue, previous CNN-based methods reuse the backbone network to extract part-discrimination features, the inference process of which consumes much time. Therefore, we introduce generalized multiple instance learning (MIL) into fine-grained recognition. In generalized MIL, an aircraft is assumed to consist of multiple instances (such as head, tail, and body). Firstly, instance-level representations are obtained by the feature extractor and instance conversion component. Secondly, the obtained instance features are scored by an MIL classifier, which can yield high-level part semantics. Finally, a fine-grained object label is inferred by a MIL pooling function that aggregates multiple instance scores. The proposed approach is trained end-to-end without part annotations and complex location networks. Experimental evidence is conducted to prove the feasibility and effectiveness of our approach on combined aircraft images (CAIs).<\/jats:p>","DOI":"10.3390\/rs13245132","type":"journal-article","created":{"date-parts":[[2021,12,20]],"date-time":"2021-12-20T02:40:32Z","timestamp":1639968032000},"page":"5132","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":11,"title":["Multiple Instance Learning Convolutional Neural Networks for Fine-Grained Aircraft Recognition"],"prefix":"10.3390","volume":"13","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9389-5610","authenticated-orcid":false,"given":"Xiaolan","family":"Huang","sequence":"first","affiliation":[{"name":"School of Geography and Information Engineering, China University of Geosciences, Wuhan 430078, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1387-2786","authenticated-orcid":false,"given":"Kai","family":"Xu","sequence":"additional","affiliation":[{"name":"School of Geography and Information Engineering, China University of Geosciences, Wuhan 430078, China"}]},{"given":"Chuming","family":"Huang","sequence":"additional","affiliation":[{"name":"School of Geography and Information Engineering, China University of Geosciences, Wuhan 430078, China"}]},{"given":"Chengrui","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Geography and Information Engineering, China University of Geosciences, Wuhan 430078, China"}]},{"given":"Kun","family":"Qin","sequence":"additional","affiliation":[{"name":"School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China"}]}],"member":"1968","published-online":{"date-parts":[[2021,12,17]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Yang, Y., and Newsam, S. (2010, January 2\u20135). Bag-of-visual-words and spatial extensions for land-use classification. Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems, San Jose, CA, USA.","DOI":"10.1145\/1869790.1869829"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1023\/B:VISI.0000029664.99615.94","article-title":"Distinctive Image Features from Scale-Invariant Keypoints","volume":"60","author":"Lowe","year":"2004","journal-title":"Int. J. Comput. Vis."},{"key":"ref_3","first-page":"39","article-title":"Aircraft Identification by Moment Invariants","volume":"100","author":"Dudani","year":"2009","journal-title":"IEEE Trans. Comput."},{"key":"ref_4","first-page":"3771","article-title":"Aircraft recognition model based on moment invariants and neural network","volume":"14","author":"Zhang","year":"2009","journal-title":"Comput. Knowl. Technol."},{"key":"ref_5","unstructured":"Hsieh, J.W., Chen, J.M., Chuang, C.H., and Fan, K.C. (2004, January 24\u201327). Novel aircraft type recognition with learning capabilities in satellite images. Proceedings of the 2004 International Conference on Image Processing, Singapore."},{"key":"ref_6","first-page":"51","article-title":"Aircraft target recognition in remote sensing image using independent component analysis Zernike moments","volume":"6","author":"Liu","year":"2011","journal-title":"CAAI Trans. Intell. Syst."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Wang, D., Xin, H., Wei, Z., and Yu, H. (2009, January 16\u201319). A method of aircraft image target recognition based on modified PCA features and SVM. Proceedings of the 2009 9th International Conference on Electronic Measurement & Instruments, Beijing, China.","DOI":"10.1109\/ICEMI.2009.5274100"},{"key":"ref_8","first-page":"136","article-title":"A Method of Tree Classifier for the Recognition of Airplane Types","volume":"28","author":"Ke","year":"2006","journal-title":"Comput. Eng. Sci."},{"key":"ref_9","unstructured":"Zhu, X., Ma, B., Guo, G., and Liu, G. (2016, January 12\u201314). Aircraft Type Classification Based on an Optimized Bag of Words Model. Proceedings of the 2016 IEEE Chinese Guidance Navigation and Control Conference, Nanjing, China."},{"key":"ref_10","first-page":"261","article-title":"Aircraft recognition algorithm based on PCA and image matching","volume":"14","author":"Zhao","year":"2009","journal-title":"Chin. J. Stereol. Image Anal."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1413","DOI":"10.1109\/LGRS.2017.2715858","article-title":"Aircraft Recognition Based on Landmark Detection in Remote Sensing Images","volume":"14","author":"Zhao","year":"2017","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_12","first-page":"1097","article-title":"ImageNet classification with deep convolutional neural networks","volume":"25","author":"Krizhevsky","year":"2012","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_13","unstructured":"Simonyan, K., and Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., and Rabinovich, A. (2015, January 7\u201312). Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"745","DOI":"10.1080\/2150704X.2015.1072288","article-title":"Object recognition in remote sensing images using sparse deep belief networks","volume":"6","author":"Diao","year":"2015","journal-title":"Remote Sens. Lett."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"282","DOI":"10.1109\/LGRS.2017.2786232","article-title":"Aircraft type recognition based on segmentation with deep convolutional neural networks","volume":"15","author":"Zuo","year":"2018","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Sun, H., Zuo, J., Wang, H., Xu, G., and Sun, X. (2018). Aircraft type recognition in remote sensing images based on feature learning with conditional generative adversarial networks. Remote Sens., 10.","DOI":"10.3390\/rs10071123"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Zhang, N., Donahue, J., Girshick, R., and Darrell, T. (2014, January 6\u201312). Part-based R-CNNs for fine-grained category detection. Proceedings of the European Conference on Computer Vision, Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10590-1_54"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Fu, K., Dai, W., Zhang, Y., Wang, Z., Yan, M., and Sun, X. (2019). Multicam: Multiple class activation mapping for aircraft recognition in remote sensing images. Remote Sens., 11.","DOI":"10.3390\/rs11050544"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Xiong, Y., Niu, X., Dou, Y., Qie, H., and Wang, K. (2020). Non-locally Enhanced Feature Fusion Network for Aircraft Recognition in Remote Sensing Images. Remote Sens., 12.","DOI":"10.3390\/rs12040681"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"1259","DOI":"10.1109\/TIP.2017.2772836","article-title":"Semi-Supervised Deep Learning Using Pseudo Labels for Hyperspectral Image Classification","volume":"27","author":"Wu","year":"2018","journal-title":"IEEE Trans. Image Process."},{"key":"ref_23","unstructured":"Tarvainen, A., and Valpola, H. (2017). Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results. arXiv."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Kang, J., Fernandez-Baltran, R., Ye, Z., Xiaohua, T., Ghamisi, P., and Plaza, A. (2020). High-rankness regularized semi-supervised deep metric learning for remote sensing imagery. Remote Sens., 12.","DOI":"10.3390\/rs12162603"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Protopapadakis, E., Doulamis, A., Doulamis, N., and Maltezos, E. (2021). Stacked autoencoders driven by semi-supervised learning for building extraction from near infrared remote sensing imagery. Remote Sens., 13.","DOI":"10.3390\/rs13030371"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Fang, B., Li, Y., Zhang, H., and Chan, J. (2018). Semi-supervised deep learning classification for hyperspectral image based on dual-strategy sample selection. Remote Sens., 10.","DOI":"10.3390\/rs10040574"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1016\/S0004-3702(96)00034-3","article-title":"Solving the multiple instance problem with axis-parallel rectangles","volume":"89","author":"Dietterich","year":"1997","journal-title":"Artif. Intell."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Pinheiro, P.O., and Collobert, R. (2015, January 8\u201310). From image-level to pixel-level labeling with convolutional networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298780"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Wu, J., Yu, Y., Huang, C., and Yu, K. (2015, January 8\u201310). Deep multiple instance learning for image classification and auto-annotation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298968"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"1993","DOI":"10.1016\/j.sigpro.2011.03.004","article-title":"LSA based multi-instance learning algorithm for image retrieval","volume":"91","author":"Li","year":"2011","journal-title":"Signal. Process."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Sun, M., Han, T.X., Liu, M.-C., and Khodayari-Rostamabad, A. (2016, January 4\u20138). Multiple instance learning convolutional neural networks for object recognition. Proceedings of the 2016 23rd International Conference on Pattern Recognition, Cancun, Mexico.","DOI":"10.1109\/ICPR.2016.7900139"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Fan, M., Chakraborti, T., Eric, I., Chang, C., Xu, Y., and Rittscher, J. (2020, January 3\u20137). Fine-Grained Multi-Instance Classification in Microscopy Through Deep Attention. Proceedings of the 2020 IEEE 17th International Symposium on Biomedical Imaging, Iowa City, IA, USA.","DOI":"10.1109\/ISBI45749.2020.9098704"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"3685","DOI":"10.1109\/TGRS.2019.2960889","article-title":"Deep multiple instance convolutional neural networks for learning robust scene representations","volume":"58","author":"Li","year":"2020","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_34","unstructured":"Ilse, M., Tomczak, J., and Welling, M. (2018, January 10\u201315). Attention-based deep multiple instance learning. Proceedings of the International Conference on Machine Learning, Stockholm, Sweden."},{"key":"ref_35","unstructured":"Chopra, S., Hadsell, R., and LeCun, Y. (2005, January 20\u201325). Learning a similarity metric discriminatively with application to face verification. Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, CA, USA."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"4683","DOI":"10.1109\/TIP.2020.2973812","article-title":"The devil is in the channels: Mutual-channel loss for fine-grained image classification","volume":"29","author":"Chang","year":"2020","journal-title":"IEEE Trans. Image Process."},{"key":"ref_37","unstructured":"Goodfellow, I., Warde-Farley, D., Mirza, M., Courville, A., and Bengio, Y. (2013, January 7\u201319). Maxout networks. Proceedings of the International Conference on Machine Learning, Atlanta, GA, USA."},{"key":"ref_38","first-page":"481","article-title":"A Survey of Multi-instance Learning Algorithms for Image Semantic Analysis","volume":"28","author":"Li","year":"2013","journal-title":"Control and Decision."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Weidmann, N., Frank, E., and Pfahringer, B. (2003, January 22\u201326). A two-level learning method for generalized multi-instance problems. Proceedings of the European Conference on Machine Learning, Cavtat-Dubrovnik, Croatia.","DOI":"10.1007\/978-3-540-39857-8_42"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., and Batra, D. (2017, January 22\u201329). Grad-cam: Visual explanations from deep networks via gradient-based localization. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.74"},{"key":"ref_41","unstructured":"(2020, July 01). Gaofen Challenge on Automated High-Resolution Earth Observation Image Interpretation. Available online: http:\/\/en.sw.chreos.org."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"296","DOI":"10.1016\/j.isprsjprs.2019.11.023","article-title":"Object detection in optical remote sensing images: A survey and a new benchmark","volume":"159","author":"Li","year":"2020","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Xia, G.-S., Bai, X., Ding, J., Zhu, Z., Belongie, S., Luo, J., Datcu, M., Pelillo, M., and Zhang, L. (2018, January 18\u201322). DOTA: A large-scale dataset for object detection in aerial images. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00418"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/13\/24\/5132\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T07:50:43Z","timestamp":1760169043000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/13\/24\/5132"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,12,17]]},"references-count":43,"journal-issue":{"issue":"24","published-online":{"date-parts":[[2021,12]]}},"alternative-id":["rs13245132"],"URL":"https:\/\/doi.org\/10.3390\/rs13245132","relation":{},"ISSN":["2072-4292"],"issn-type":[{"type":"electronic","value":"2072-4292"}],"subject":[],"published":{"date-parts":[[2021,12,17]]}}}