{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,5]],"date-time":"2026-03-05T03:30:17Z","timestamp":1772681417122,"version":"3.50.1"},"reference-count":44,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2016,2,19]],"date-time":"2016-02-19T00:00:00Z","timestamp":1455840000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>High spatial resolution (HSR) image scene classification is aimed at bridging the semantic gap between low-level features and high-level semantic concepts, which is a challenging task due to the complex distribution of ground objects in HSR images. Scene classification based on the bag-of-visual-words (BOVW) model is one of the most successful ways to acquire the high-level semantic concepts. However, the BOVW model assigns local low-level features to their closest visual words in the \u201cvisual vocabulary\u201d (the codebook obtained by k-means clustering), which discards too many useful details of the low-level features in HSR images. In this paper, a feature coding method under the Fisher kernel (FK) coding framework is introduced to extend the BOVW model by characterizing the low-level features with a gradient vector instead of the count statistics in the BOVW model, which results in a significant decrease in the codebook size and an acceleration of the codebook learning process. By considering the differences in the distributions of the ground objects in different regions of the images, local FK (LFK) is proposed for the HSR image scene classification method. The experimental results show that the proposed scene classification methods under the FK coding framework can greatly reduce the computational cost, and can obtain a better scene classification accuracy than the methods based on the traditional BOVW model.<\/jats:p>","DOI":"10.3390\/rs8020157","type":"journal-article","created":{"date-parts":[[2016,2,19]],"date-time":"2016-02-19T11:29:39Z","timestamp":1455881379000},"page":"157","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":102,"title":["The Fisher Kernel Coding Framework for High Spatial Resolution Scene Classification"],"prefix":"10.3390","volume":"8","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0206-1427","authenticated-orcid":false,"given":"Bei","family":"Zhao","sequence":"first","affiliation":[{"name":"State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China"},{"name":"Department of Geography and Resource Management, The Chinese University of Hong Kong, Sha Tin, Hong Kong"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9446-5850","authenticated-orcid":false,"given":"Yanfei","family":"Zhong","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China"}]},{"given":"Liangpei","family":"Zhang","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China"}]},{"given":"Bo","family":"Huang","sequence":"additional","affiliation":[{"name":"Department of Geography and Resource Management, The Chinese University of Hong Kong, Sha Tin, Hong Kong"}]}],"member":"1968","published-online":{"date-parts":[[2016,2,19]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"180","DOI":"10.1016\/j.isprsjprs.2013.09.014","article-title":"Geographic object-based image analysis\u2014Towards a new paradigm","volume":"87","author":"Blaschke","year":"2014","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"631","DOI":"10.1109\/JPROC.2012.2211551","article-title":"Land-cover mapping by markov modeling of spatial-contextual information in very-high-resolution remote sensing images","volume":"101","author":"Moser","year":"2013","journal-title":"Proc. IEEE"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"841","DOI":"10.1109\/TGRS.2013.2244604","article-title":"Multiagent object-based classifier for high spatial resolution imagery","volume":"52","author":"Zhong","year":"2014","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"7023","DOI":"10.1109\/TGRS.2014.2306692","article-title":"A hybrid object-oriented conditional random field classification framework for high spatial resolution remote sensing imagery","volume":"52","author":"Zhong","year":"2014","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"8424","DOI":"10.3390\/rs6098424","article-title":"A multichannel gray level co-occurrence matrix for multi\/hyperspectral image texture representation","volume":"6","author":"Huang","year":"2014","journal-title":"Remote Sens."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"3947","DOI":"10.1109\/TGRS.2011.2128330","article-title":"Hyperspectral image segmentation using a new bayesian approach with active learning","volume":"49","author":"Li","year":"2011","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"3198","DOI":"10.1109\/TGRS.2010.2044508","article-title":"Rule-based classification of a very high resolution image in an urban environment using multispectral segmentation guided by cartographic data","volume":"48","author":"Bouziani","year":"2010","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"2825","DOI":"10.1080\/01431161003745608","article-title":"Multi-scale geobia with very high spatial resolution digital aerial imagery: Scale, texture and image objects","volume":"32","author":"Kim","year":"2011","journal-title":"Int. J. Remote Sens."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"28","DOI":"10.1109\/LGRS.2009.2023536","article-title":"Semantic annotation of satellite images using latent dirichlet allocation","volume":"7","author":"Lienou","year":"2010","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1357","DOI":"10.1109\/LGRS.2015.2402391","article-title":"A comparative study of bag-of-words and bag-of-topics models of eo image patches","volume":"12","author":"Bahmanyar","year":"2015","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1016\/j.rse.2015.07.017","article-title":"A linear dirichlet mixture model for decomposing scenes: Application to analyzing urban functional zonings","volume":"169","author":"Zhang","year":"2015","journal-title":"Remote Sens. Environ."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Zhao, B., Zhong, Y., Xia, G.S., and Zhang, L. (2015). Dirichlet-derived multiple topic scene classification model for high spatial resolution remote sensing imagery. IEEE Trans. Geosci. Remote Sens.","DOI":"10.1109\/TGRS.2015.2496185"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"388","DOI":"10.1109\/LGRS.2009.2014400","article-title":"Urban-area segmentation using visual words","volume":"6","author":"Weizman","year":"2009","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"366","DOI":"10.1109\/LGRS.2009.2035644","article-title":"Object classification of aerial images with bag-of-visual words","volume":"7","author":"Xu","year":"2010","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"1947","DOI":"10.1109\/TGRS.2014.2351395","article-title":"Pyramid of spatial relatons for scene-level land use classification","volume":"53","author":"Chen","year":"2015","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"4620","DOI":"10.1109\/JSTARS.2014.2339842","article-title":"Land-use scene classification using a concentric circle-structured multiscale bag-of-visual-words model","volume":"7","author":"Zhao","year":"2014","journal-title":"IEEE J. Sel. Topics Appl. Earth Observ."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"2015","DOI":"10.1109\/JSTARS.2015.2444405","article-title":"Unsupervised feature learning via spectral clustering of multidimensional patches for remotely sensed scene classification","volume":"8","author":"Hu","year":"2015","journal-title":"IEEE J. Sel. Topics Appl. Earth Observ."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1055","DOI":"10.1109\/LGRS.2012.2228625","article-title":"High-resolution remote-sensing image classification via an approximate earth mover\u2019s distance-based bag-of-features model","volume":"10","author":"Zhang","year":"2013","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"4238","DOI":"10.1109\/TGRS.2015.2393857","article-title":"Effective and efficient midlevel visual elements-oriented land-use classification using vhr remote sensing images","volume":"53","author":"Cheng","year":"2015","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1016\/j.isprsjprs.2014.10.002","article-title":"Multi-class geospatial object detection and geographic image classification based on collection of part detectors","volume":"98","author":"Cheng","year":"2014","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"3325","DOI":"10.1109\/TGRS.2014.2374218","article-title":"Object detection in optical remote sensing images based on weakly supervised learning and high-level feature learning","volume":"53","author":"Han","year":"2015","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"2175","DOI":"10.1109\/TGRS.2014.2357078","article-title":"Saliency-guided unsupervised feature learning for scene classification","volume":"53","author":"Zhang","year":"2014","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"639","DOI":"10.1049\/iet-cvi.2014.0270","article-title":"Auto-encoder-based shared mid-level visual dictionary learning for scene classification using very high resolution remote sensing images","volume":"9","author":"Cheng","year":"2015","journal-title":"Comput. Vis. IET"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"610","DOI":"10.1109\/TSMC.1973.4309314","article-title":"Textural features for image classification","volume":"SMC-3","author":"Haralick","year":"1973","journal-title":"IEEE Trans. Syst. Man Cybern."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1023\/B:VISI.0000029664.99615.94","article-title":"Distinctive image features from scale-invariant keypoints","volume":"60","author":"Lowe","year":"2004","journal-title":"Int. J. Comput. Vis."},{"key":"ref_26","unstructured":"Csurka, G., Dance, C., Fan, L., Willamowski, J., and Bray, C. (2004, January 10\u201316). Visual categorization with bags of keypoints. Proceedings of the Workshop on Statistical Learning in Computer Vision, ECCV, Prague, Czech Republic."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"1356","DOI":"10.1109\/TGRS.2013.2250978","article-title":"Semantic annotation of satellite images using author-genre-topic model","volume":"52","author":"Luo","year":"2014","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1145\/2133806.2133826","article-title":"Probabilistic topic models","volume":"55","author":"Blei","year":"2012","journal-title":"Commun. ACM"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"6207","DOI":"10.1109\/TGRS.2015.2435801","article-title":"Scene classification based on the multifeature fusion probabilistic topic model for high spatial resolution remote sensing imagery","volume":"53","author":"Zhong","year":"2015","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"1204","DOI":"10.1080\/2150704X.2013.858843","article-title":"Scene classification via latent dirichlet allocation using a hybrid generative\/discriminative strategy for high spatial resolution remote sensing imagery","volume":"4","author":"Zhao","year":"2013","journal-title":"Remote Sens. Lett."},{"key":"ref_31","first-page":"993","article-title":"Latent dirichlet allocation","volume":"3","author":"Blei","year":"2003","journal-title":"J. Mach. Learn. Res."},{"key":"ref_32","unstructured":"Lazebnik, S., Schmid, C., and Ponce, J. (2006, January 17\u201322). Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, New York, NY, USA."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Yang, Y., and Newsam, S. (2010, January 2\u20135). Bag-of-visual-words and spatial extensions for land-use classification. Proceedings of the ACM SIGSPATIAL Conference on Advances in Geographic Information Systems, San Jose, CA, USA.","DOI":"10.1145\/1869790.1869829"},{"key":"ref_34","unstructured":"Yang, Y., and Newsam, S. (2011, January 6\u201313). Spatial pyramid co-occurrence for image classification. Proceedings of the IEEE International Conference on Computer Vision, Barcelona, Spain."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"439","DOI":"10.1109\/TGRS.2013.2241444","article-title":"Unsupervised feature learning for aerial scene classification","volume":"52","author":"Cheriyadat","year":"2014","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"2395","DOI":"10.1080\/01431161.2011.608740","article-title":"High-resolution satellite scene classification using a sparse coding based multiple feature combination","volume":"33","author":"Sheng","year":"2012","journal-title":"Int. J. Remote Sens."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"652","DOI":"10.1109\/LGRS.2012.2216499","article-title":"Automatic annotation of satellite images via multifeature joint sparse coding with spatial relation constraint","volume":"10","author":"Zheng","year":"2013","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"173","DOI":"10.1109\/LGRS.2010.2055033","article-title":"Satellite image classification via two-layer sparse coding with biased image representation","volume":"8","author":"Dai","year":"2011","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"493","DOI":"10.1109\/TPAMI.2013.113","article-title":"Feature coding in image classification: A comprehensive study","volume":"36","author":"Huang","year":"2013","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1016\/j.isprsjprs.2013.12.011","article-title":"Efficient, simultaneous detection of multi-class geospatial targets based on visual saliency modeling and discriminative learning of sparse coding","volume":"89","author":"Han","year":"2014","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_41","first-page":"143","article-title":"Improving the fisher kernel for large-scale image classification","volume":"Volume 6314","author":"Daniilidis","year":"2010","journal-title":"Proceedings of the European Conference on Computer Vision (ECCV)"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Perronnin, F., and Dance, C. (2007, January 17\u201322). Fisher kernels on visual vocabularies for image categorization. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Minneapolis, MN, USA.","DOI":"10.1109\/CVPR.2007.383266"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/1961189.1961199","article-title":"Libsvm: A library for support vector machines","volume":"2","author":"Chang","year":"2011","journal-title":"ACM Trans. Intell. Syst. Technol."},{"key":"ref_44","unstructured":"Barla, A., Odone, F., and Verri, A. (2003, January 14\u201317). Histogram intersection kernel for image classification. Proceedings of the IEEE International Conference on Image Processing, Barcelona, Spain."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/8\/2\/157\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T19:19:24Z","timestamp":1760210364000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/8\/2\/157"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,2,19]]},"references-count":44,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2016,2]]}},"alternative-id":["rs8020157"],"URL":"https:\/\/doi.org\/10.3390\/rs8020157","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2016,2,19]]}}}