{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T03:55:54Z","timestamp":1760241354696,"version":"build-2065373602"},"reference-count":45,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2018,2,9]],"date-time":"2018-02-09T00:00:00Z","timestamp":1518134400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61772448","61402394","61603326","61602400"],"award-info":[{"award-number":["61772448","61402394","61603326","61602400"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"National Natural Science Foundation of Jiangsu Province of China","award":["BK20140462"],"award-info":[{"award-number":["BK20140462"]}]},{"name":"Natural Science Major Project of the Higher Education Institutions of Jiangsu Province of China","award":["17KJA520006"],"award-info":[{"award-number":["17KJA520006"]}]},{"name":"The Fundamental Research Funds for the Central Universities","award":["2014B33114"],"award-info":[{"award-number":["2014B33114"]}]},{"name":"The Graduate Student Scientific Research Innovation Projects in Jiangsu Province","award":["KYLX_0436"],"award-info":[{"award-number":["KYLX_0436"]}]},{"name":"The Key Natural Science Foundation of the Colleges and Universities in Anhui Province of China","award":["KJ2016A592"],"award-info":[{"award-number":["KJ2016A592"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>The objective of this work is image classification, whose purpose is to group images into corresponding semantic categories. Four contributions are made as follows: (i) For computational simplicity and efficiency, we directly adopt raw image patch vectors as local descriptors encoded by Fisher vector (FV) subsequently; (ii) For obtaining representative local features within the FV encoding framework, we compare and analyze three typical sampling strategies: random sampling, saliency-based sampling and dense sampling; (iii) In order to embed both global and local spatial information into local features, we construct an improved spatial geometry structure which shows good performance; (iv) For reducing the storage and CPU costs of high dimensional vectors, we adopt a new feature selection method based on supervised mutual information (MI), which chooses features by an importance sorting algorithm. We report experimental results on dataset STL-10. It shows very promising performance with this simple and efficient framework compared to conventional methods.<\/jats:p>","DOI":"10.3390\/info9020038","type":"journal-article","created":{"date-parts":[[2018,2,9]],"date-time":"2018-02-09T12:46:27Z","timestamp":1518180387000},"page":"38","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Local Patch Vectors Encoded by Fisher Vectors for Image Classification"],"prefix":"10.3390","volume":"9","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6337-2747","authenticated-orcid":false,"given":"Shuangshuang","family":"Chen","sequence":"first","affiliation":[{"name":"Institute of Intelligence Science and Technology, Hohai University, Nanjing 211100, China"},{"name":"School of Information Science and Technology, Yancheng Teachers University, Yancheng 224002, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Huiyi","family":"Liu","sequence":"additional","affiliation":[{"name":"Institute of Intelligence Science and Technology, Hohai University, Nanjing 211100, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiaoqin","family":"Zeng","sequence":"additional","affiliation":[{"name":"Institute of Intelligence Science and Technology, Hohai University, Nanjing 211100, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Subin","family":"Qian","sequence":"additional","affiliation":[{"name":"Institute of Intelligence Science and Technology, Hohai University, Nanjing 211100, China"},{"name":"School of Information Science and Technology, Yancheng Teachers University, Yancheng 224002, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wei","family":"Wei","sequence":"additional","affiliation":[{"name":"College of Education, Anqing Normal University, Anqing 246133, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Guomin","family":"Wu","sequence":"additional","affiliation":[{"name":"Institute of Intelligence Science and Technology, Hohai University, Nanjing 211100, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Baobin","family":"Duan","sequence":"additional","affiliation":[{"name":"Institute of Intelligence Science and Technology, Hohai University, Nanjing 211100, China"},{"name":"Department of Mathematics and Physics, Hefei University, Hefei 230601, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2018,2,9]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1109\/83.892448","article-title":"Image classification for content-based indexing","volume":"10","author":"Vailaya","year":"2001","journal-title":"IEEE Trans. Image Process."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1109\/TCSVT.2003.818349","article-title":"An introduction to biometric recognition","volume":"14","author":"Jain","year":"2004","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/360402.360406","article-title":"Web mining research: A survey","volume":"2","author":"Kosala","year":"2000","journal-title":"ACM Sigkdd Explor. Newsl."},{"key":"ref_4","unstructured":"Collins, R.T., Lipton, A.J., Kanade, T., Fujiyoshi, H., Duggins, D., Tsin, Y., Tolliver, D., Enomoto, N., Hasegawa, O., and Burt, P. (2000). A System for Video Surveillance and Monitoring, The Robotics Institute, Carnegie Mellon University. VSAM Final Report."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1023\/B:VISI.0000029664.99615.94","article-title":"Distinctive image features from scale-invariant keypoints","volume":"60","author":"Lowe","year":"2004","journal-title":"Int. J. Comput. Vis."},{"key":"ref_6","first-page":"404","article-title":"SURF: Speeded Up Robust Features","volume":"110","author":"Bay","year":"2006","journal-title":"Comput. Vis. Image Underst."},{"key":"ref_7","unstructured":"Dalal, N., and Triggs, B. (2005, January 21\u201323). Histograms of oriented gradients for human detection. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, CA, USA."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Ahonen, T., Hadid, A., and Pietik\u00e4inen, M. (2004, January 11\u201314). Face Recognition with Local Binary Patterns. Proceedings of the European Conference on Computer Vision, Prague, Czech Republic.","DOI":"10.1007\/978-3-540-24670-1_36"},{"key":"ref_9","unstructured":"Wang, Z., Fan, B., and Wu, F. (2011, January 6\u201313). Local Intensity Order Pattern for feature description. Proceedings of the International Conference on Computer Vision, Barcelona, Spain."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Alcantarilla, P.F., Bartoli, A., and Davison, A.J. (2012, January 7\u201313). KAZE Features. Proceedings of the European Conference on Computer Vision, Florence, Italy.","DOI":"10.1007\/978-3-642-33783-3_16"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"3368","DOI":"10.1016\/j.eswa.2014.11.069","article-title":"Scene classification based on single-layer SAE and SVM","volume":"42","author":"Yin","year":"2015","journal-title":"Expert Syst. Appl."},{"key":"ref_12","unstructured":"S\u00e1nchez, J., Perronnin, F., Mensink, T., and Verbeek, J. (2013). Compressed Fisher Vectors for Large-Scale Image Classification, HAL-Inria. Research Report RR-8209."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Shi, H., Zhu, X., Lei, Z., Liao, S., and Li, S.Z. (2016, January 13\u201318). Learning Discriminative Features with Class Encoder. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, San Francisco, CA, USA.","DOI":"10.1109\/CVPRW.2016.143"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Zeiler, M.D., and Fergus, R. (2014, January 6\u201312). Visualizing and understanding convolutional networks. Proceedings of the European Conference on Computer Vision, Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10590-1_53"},{"key":"ref_15","unstructured":"Gao, B.B., Wei, X.S., Wu, J., and Lin, W. (arXiv, 2015). Deep spatial pyramid: The devil is once again in the details, arXiv."},{"key":"ref_16","unstructured":"Simonyan, K., and Zisserman, A. (arXiv, 2014). Very deep convolutional networks for large-scale image recognition, arXiv."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"426","DOI":"10.1016\/j.sigpro.2016.05.021","article-title":"A practical guide to CNNs and Fisher Vectors for image instance retrieval","volume":"128","author":"Chandrasekhar","year":"2016","journal-title":"Signal Process."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Jurie, F., and Triggs, B. (2005, January 17\u201321). Creating Efficient Codebooks for Visual Recognition. Proceedings of the IEEE International Conference on Computer Vision, Beijing, China.","DOI":"10.1109\/ICCV.2005.66"},{"key":"ref_19","unstructured":"Nowak, E., Jurie, F., and Triggs, B. (2008, January 12\u201318). Sampling Strategies for Bag-of-Features Image Classification. Proceedings of the European Conference on Computer Vision, Marseille, France."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"213","DOI":"10.1007\/s11263-006-9794-4","article-title":"Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study","volume":"73","author":"Zhang","year":"2007","journal-title":"Int. J. Comput. Vis."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Hu, J., Xia, G.S., Hu, F., and Sun, H. (2015, January 13\u201318). A comparative study of sampling analysis in scene classification of high-resolution remote sensing imagery. Proceedings of the Geoscience and Remote Sensing Symposium, Milan, Italy.","DOI":"10.1109\/IGARSS.2015.7326290"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"2407","DOI":"10.1109\/TIP.2016.2549360","article-title":"Compact Representation of High-Dimensional Feature Vectors for Large-Scale Image Recognition and Retrieval","volume":"25","author":"Zhang","year":"2016","journal-title":"IEEE Trans. Image Process."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Shi, F., Petriu, E., and Laganiere, R. (2013, January 25\u201327). Sampling Strategies for Real-Time Action Recognition. Proceedings of the Computer Vision and Pattern Recognition, Portland, OR, USA.","DOI":"10.1109\/CVPR.2013.335"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"420","DOI":"10.1109\/34.990146","article-title":"A selective attention-based method for visual pattern recognition with application to handwritten digit recognition and face recognition","volume":"24","author":"Salah","year":"2002","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"185","DOI":"10.1109\/TPAMI.2012.89","article-title":"State-of-the-Art in Visual Attention Modeling","volume":"35","author":"Borji","year":"2012","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Gu, Z., Zhang, L., and Li, H. (2013, January 15\u201318). Learning a blind image quality index based on visual saliency guided sampling and Gabor filtering. Proceedings of the IEEE International Conference on Image Processing, Melbourne, Australia.","DOI":"10.1109\/ICIP.2013.6738039"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Zhang, L., Gu, Z., and Li, H. (2014, January 27\u201330). SDSP: A novel saliency detection method by combining simple priors. Proceedings of the IEEE International Conference on Image Processing, Paris, France.","DOI":"10.1109\/ICIP.2013.6738036"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"2978","DOI":"10.1109\/TCYB.2015.2493538","article-title":"Good practices for learning to recognize actions using FV and VLAD","volume":"46","author":"Wu","year":"2016","journal-title":"IEEE Trans. Cybern."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Ma, B., Su, Y., and Jurie, F. (2012, January 7\u201313). Local Descriptors Encoded by Fisher Vectors for Person Re-identification. Proceedings of the European Conference on Computer Vision, Florence, Italy.","DOI":"10.1007\/978-3-642-33863-2_41"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"222","DOI":"10.1007\/s11263-013-0636-x","article-title":"Image Classification with the Fisher Vector: Theory and Practice","volume":"105","author":"Perronnin","year":"2013","journal-title":"Int. J. Comput. Vis."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Wu, J., and Cai, J. (2014, January 24\u201327). Compact representation for image classification: To choose or to compress?. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.121"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1109\/TPAMI.2010.57","article-title":"Product quantization for nearest neighbor search","volume":"33","author":"Douze","year":"2011","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Sanchez, J., and Perronnin, F. (2011, January 20\u201325). High-dimensional signature compression for large-scale image classification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Colorado Springs, CO, USA.","DOI":"10.1109\/CVPR.2011.5995504"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Joachims, T. (2006, January 20\u201323). Training linear SVMs in linear time. Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Philadelphia, PA, USA.","DOI":"10.1145\/1150402.1150429"},{"key":"ref_35","first-page":"1871","article-title":"LIBLINEAR: A library for large linear classification","volume":"9","author":"Fan","year":"2008","journal-title":"J. Mach. Learn. Res."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Vedaldi, A., and Fulkerson, B. (2010, January 25\u201329). VLFeat: An open and portable library of computer vision algorithms. Proceedings of the 18th ACM International Conference on Multimedia, Firenze, Italy.","DOI":"10.1145\/1873951.1874249"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Miclut, B. (2014, January 2\u20135). Committees of deep feedforward networks trained with few data. Proceedings of the German Conference on Pattern Recognition, M\u00fcnster, Germany.","DOI":"10.1007\/978-3-319-11752-2_62"},{"key":"ref_38","unstructured":"Coates, A., and Ng, A.Y. (2011, January 12\u201317). Selecting receptive fields in deep networks. Proceedings of the Advances in Neural Information Processing Systems, Granada, Spain."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"1017","DOI":"10.1109\/TCYB.2016.2536638","article-title":"Stacked convolutional denoising auto-encoders for feature representation","volume":"47","author":"Du","year":"2017","journal-title":"IEEE Trans. Cybern."},{"key":"ref_40","unstructured":"Zou, W.Y., Ng, A.Y., Zhu, S., and Yu, K. (2012, January 3\u20136). Deep learning of invariant features via simulated fixations in video. Proceedings of the Advances in Neural Information Processing Systems, Lake Tahoe, NV, USA."},{"key":"ref_41","unstructured":"Romero, A., Radeva, P., and Gatta, C. (arXiv, 2014). No more meta-parameter tuning in unsupervised sparse feature learning, arXiv."},{"key":"ref_42","unstructured":"Hui, K.Y. (2013, January 16\u201321). Direct modeling of complex invariances for visual object features. Proceedings of the International Conference on Machine Learning, Atlanta, GA, USA."},{"key":"ref_43","unstructured":"Bo, L., Ren, X., and Fox, D. Unsupervised feature learning for RGB-D based object recognition. Proceedings of the 13th International Symposium on Experimental Robotics."},{"key":"ref_44","unstructured":"Zhao, J., Mathieu, M., Goroshin, R., and Lecun, Y. (2016, January 2\u20134). Stacked What-Where Auto-encoders. Proceedings of the International Conference on Learning Representations, San Juan, Puerto Rico."},{"key":"ref_45","unstructured":"Dosovitskiy, A., Springenberg, J.T., and Riedmiller, M. (2014, January 8\u201313). Discriminative unsupervised feature learning with convolutional neural networks. Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada."}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/9\/2\/38\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T14:54:29Z","timestamp":1760194469000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/9\/2\/38"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,2,9]]},"references-count":45,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2018,2]]}},"alternative-id":["info9020038"],"URL":"https:\/\/doi.org\/10.3390\/info9020038","relation":{},"ISSN":["2078-2489"],"issn-type":[{"type":"electronic","value":"2078-2489"}],"subject":[],"published":{"date-parts":[[2018,2,9]]}}}