{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,28]],"date-time":"2025-10-28T15:05:55Z","timestamp":1761663955592,"version":"build-2065373602"},"reference-count":43,"publisher":"MDPI AG","issue":"9","license":[{"start":{"date-parts":[[2019,9,4]],"date-time":"2019-09-04T00:00:00Z","timestamp":1567555200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Provincial Science and Technology Innovation Special Fund Project of Jilin Province","award":["20190302026GX"],"award-info":[{"award-number":["20190302026GX"]}]},{"name":"Jilin Province Development and Reform Commission Industrial Technology Research and Development Project","award":["2019C054-4"],"award-info":[{"award-number":["2019C054-4"]}]},{"name":"State Key Laboratory of Applied Optics Open Fund Project","award":["20173660"],"award-info":[{"award-number":["20173660"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IJGI"],"abstract":"<jats:p>Image retrieval applying deep convolutional features has achieved the most advanced performance in most standard benchmark tests. In image retrieval, deep metric learning (DML) plays a key role and aims to capture semantic similarity information carried by data points. However, two factors may impede the accuracy of image retrieval. First, when learning the similarity of negative examples, current methods separate negative pairs into equal distance in the embedding space. Thus, the intraclass data distribution might be missed. Second, given a query, either a fraction of data points, or all of them, are incorporated to build up the similarity structure, which makes it rather complex to calculate similarity or to choose example pairs. In this study, in order to achieve more accurate image retrieval, we proposed a method based on learning to rank and multiple loss (LRML). To address the first problem, through learning the ranking sequence, we separate the negative pairs from the query image into different distance. To tackle the second problem, we used a positive example in the gallery and negative sets from the bottom five ranked by similarity, thereby enhancing training efficiency. Our significant experimental results demonstrate that the proposed method achieves state-of-the-art performance on three widely used benchmarks.<\/jats:p>","DOI":"10.3390\/ijgi8090393","type":"journal-article","created":{"date-parts":[[2019,9,5]],"date-time":"2019-09-05T03:22:36Z","timestamp":1567653756000},"page":"393","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Image Retrieval Based on Learning to Rank and Multiple Loss"],"prefix":"10.3390","volume":"8","author":[{"given":"Lili","family":"Fan","sequence":"first","affiliation":[{"name":"College of Computer Science and Technology, Jilin University, Changchun 130012, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hongwei","family":"Zhao","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Jilin University, Changchun 130012, China"},{"name":"Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China"},{"name":"State Key Laboratory of Applied Optics, Changchun 130033, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Haoyu","family":"Zhao","sequence":"additional","affiliation":[{"name":"Editorial Department of Journal (Engineering and Technology Edition), Jilin University, Changchun 130012, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Pingping","family":"Liu","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Jilin University, Changchun 130012, China"},{"name":"Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Huangshui","family":"Hu","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2019,9,4]]},"reference":[{"key":"ref_1","unstructured":"Gong, Y.C., Wang, L.W., Guo, R.Q., and Lazebnik, S. (Volume 8695). Multi-scale orderless pooling of deep convolutional activation features. European Conference on Computer Vision, Springer. Part VII."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Oquab, M., Bottou, L., Laptev, I., and Sivic, J. (2014, January 24\u201327). Learning and transferring mid-level image representations using convolutional neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.222"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1437","DOI":"10.1109\/TPAMI.2017.2711011","article-title":"NetVLAD: CNN Architecture for Weakly Supervised Place Recognition","volume":"40","author":"Arandjelovic","year":"2018","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Hershey, J.R., Chen, Z., Le Roux, J., and Watanabe, S. (2016, January 20\u201325). Deep clustering: Discriminative embeddings for segmentation and separation. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Shanghai, China.","DOI":"10.1109\/ICASSP.2016.7471631"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Schroff, F., Kalenichenko, D., and Philbin, J. (2015, January 7\u201312). FaceNet: A Unified Embedding for Face Recognition and Clustering. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Cvpr), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298682"},{"key":"ref_6","unstructured":"Cui, Y., Zhou, F., Lin, Y., and Belongie, S. (July, January 26). Fine-grained categorization and dataset bootstrapping using deep metric learning with humans in the loop. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Cvpr), Las Vegas, NV, USA."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Chen, W.H., Chen, X.T., Zhang, J.G., and Huang, K.Q. (2017, January 21\u201326). Beyond triplet loss: A deep quadruplet network for person re-identification. Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition (Cvpr 2017), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.145"},{"key":"ref_8","unstructured":"Hermans, A., Beyer, L., and Leibe, B. (2017). In Defense of the Triplet Loss for Person Re-Identification. arXiv."},{"key":"ref_9","unstructured":"Xiao, Q., Luo, H., and Zhang, C. (2017). Margin Sample Mining Loss: A Deep Learning Based Method for Person Re-identification. arXiv."},{"key":"ref_10","unstructured":"Tarvainen, A., and Valpola, H. (2017, January 4\u20139). Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results. Proceedings of the Advances in Neural Information Processing Systems 30 (Nips 2017), Long Beach, CA, USA."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Philbin, J., Chum, O., Isard, M., Sivic, J., and Zisserman, A. (2007, January 17\u201322). Object retrieval with large vocabularies and fast spatial matching. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Minneapolis, MN, USA.","DOI":"10.1109\/CVPR.2007.383172"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Philbin, J., Chum, O., Isard, M., Sivic, J., and Zisserman, A. (2008, January 24\u201326). Lost in quantization: Improving particular object retrieval in large scale image databases. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AL, USA.","DOI":"10.1109\/CVPR.2008.4587635"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Jegou, H., Douze, M., and Schmid, C. (2008, January 12\u201318). Hamming embedding and weak geometric consistency for large scale image search. I. Proceedings of 10th European Conference on Computer Vision, ECCV 2008, Marseille, France.","DOI":"10.1007\/978-3-540-88682-2_24"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Babenko, A., Slesarev, A., Chigorin, A., and Lempitsky, V. (2014, January 6\u201312). Neural codes for image retrieval. Proceedings of the Computer Vision\u2014Eccv 2014, Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10590-1_38"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Husain, S.S., and Bober, M. (2019). REMAP: Multi-layer entropy-guided pooling of dense CNN features for image retrieval. IEEE Trans. Image Proc.","DOI":"10.1109\/TIP.2019.2917234"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Perronnin, F., Liu, Y., Sanchez, J., and Poirier, H. (2010, January 13\u201318). Large-scale image retrieval with compressed fisher vectors. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Cvpr), San Francisco, CA, USA.","DOI":"10.1109\/CVPR.2010.5540009"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"1704","DOI":"10.1109\/TPAMI.2011.235","article-title":"Aggregating Local Image Descriptors into Compact Codes","volume":"34","author":"Jegou","year":"2011","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Radenovic, F., Jegou, H., and Chum, O. (2015, January 23\u201326). Multiple measurements and joint dimensionality reduction for large scale image search with short vectors. Proceedings of the ICMR\u201915: Proceedings of the 2015 Acm International Conference on Multimedia Retrieval, Shanghai, China.","DOI":"10.1145\/2671188.2749366"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Arandjelovic, R., and Zisserman, A. (2013, January 23\u201328). All about VLAD. Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition (Cvpr), Portland, OR, USA.","DOI":"10.1109\/CVPR.2013.207"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Tolias, G., Furon, T., and Jegou, H. (2014, January 6\u201312). Orientation Covariant Aggregation of Local Descriptors with Embeddings. VI. Proceedings of the Computer Vision\u2014ECCV 2014, Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10599-4_25"},{"key":"ref_21","unstructured":"Tolias, G., and Sicre, R. (2015). Particular object retrieval with integral max-pooling of CNN activations. arXiv."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Kalantidis, Y., Mellina, C., and Osindero, S. (2016). Cross-Dimensional Weighting for Aggregated Deep Convolutional Features. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-46604-0_48"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Mohedano, E., McGuinness, K., O\u2019Connor, N.E., Salvador, A., Marques, F., and Giro-I-Nieto, X. (2016, January 6\u20139). Bags of Local Convolutional Features for Scalable Instance Search. Proceedings of the ICMR\u201916: Proceedings of the 2016 Acm International Conference on Multimedia Retrieval, New York, NY, USA.","DOI":"10.1145\/2911996.2912061"},{"key":"ref_24","unstructured":"Ong, E.J., Husain, S., and Bober, M. (2017). Siamese Network of Deep Fisher-Vector Descriptors for Image Retrieval. arXiv."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Chum, O., Mikulik, A., Perdoch, M., and Matas, J. (2011, January 20\u201325). Total recall II: Query expansion revisited. Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition (Cvpr), Colorado Springs, CO, USA.","DOI":"10.1109\/CVPR.2011.5995601"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"3466","DOI":"10.1016\/j.patcog.2014.04.007","article-title":"Visual query expansion with or without geometry: Refining local descriptors by feature aggregation","volume":"47","author":"Tolias","year":"2014","journal-title":"Pattern Recognit."},{"key":"ref_27","first-page":"1655","article-title":"Fine-tuning CNN Image Retrieval with No Human Annotation","volume":"41","author":"Tolias","year":"2018","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1229","DOI":"10.1109\/TPAMI.2013.237","article-title":"Spatially-constrained similarity measurefor large-scale object retrieval","volume":"36","author":"Shen","year":"2013","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Avrithis, Y., and Kalantidis, Y. (2012, January 7\u201313). Approximate gaussian mixtures for large scale vocabularies.computer vision. III. Proceedings of the 12th European Conference on Computer Vision, Florence, Italy.","DOI":"10.1007\/978-3-642-33712-3_2"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1007\/s11263-015-0816-y","article-title":"ImageNet Large Scale Visual Recognition Challenge","volume":"115","author":"Russakovsky","year":"2015","journal-title":"Int. J. Comput. Vis."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Gordo, A., Almazan, J., Revaud, J., and Larlus, D. (2016, January 11\u201314). Deep image retrieval: learning global representations for image search. VI. Proceedings of the Computer Vision\u2014Eccv 2016, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46466-4_15"},{"key":"ref_32","unstructured":"Chopra, S., Hadsell, R., and Lecun, Y. (2005, January 20). Learning a Similarity Metric Discriminatively, with Application to Face Verification. Proceedings of the IEEE Computer Society Conference on Computer Vision & Pattern Recognition (CVPR), Toronto, ON, Canada."},{"key":"ref_33","unstructured":"Hadsell, R., Chopra, S., and Lecun, Y. (2006, January 17). Dimensionality Reduction by Learning an Invariant Mapping. Proceedings of the IEEE Computer Society Conference on Computer Vision & Pattern Recognition (CVPR), New York, NY, USA."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Wang, J., Song, Y., Leung, T., Rosenberg, C., and Wu, Y. (2014, January 24\u201327). Learning Fine-Grained Image Similarity with Deep Ranking. Proceedings of the IEEE Computer Society Conference on Computer Vision & Pattern Recognition (CVPR), Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.180"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Sch\u00f6nberger, J.L., Radenovi\u0107, F., Chum, O., and Frahm, J.M. (2015, January 7\u201312). From single image query to detailed 3d reconstruction. Proceedings of the Computer Vision & Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7299148"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Razavian, A.S., Azizpour, H., Sullivan, J., and Carlsson, S. (2014, January 23\u201328). CNN Features off-the-shelf: An astounding baseline for recognition. Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Columbus, OH, USA.","DOI":"10.1109\/CVPRW.2014.131"},{"key":"ref_37","unstructured":"Babenko, A., and Lempitsky, V. (2015, January 13\u201316). Aggregating Deep Convolutional Features for Image Retrieval. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"98","DOI":"10.1145\/2766959","article-title":"Learning visual similarity for product design with convolutional neural networks","volume":"34","author":"Bell","year":"2015","journal-title":"ACM Trans. Graph."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Song, H.O., Xiang, Y., Jegelka, S., and Savarese, S. (July, January 26). Deep metric learning via lifted structured feature embedding. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.434"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Jegou, H., and Chum, O. (2012, January 7\u201313). Negative evidences and co-occurences in image retrieval: The benefit of pca and whitening. Pt II. Proceedings of the Computer Vision\u2014ECCV 2012, Florence, Italy.","DOI":"10.1007\/978-3-642-33709-3_55"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Imbriaco, R., Sebastian, C., and Bondarev, E. (2019). Aggregated Deep Local Features for Remote Sensing Image Retrieval. Remote Sens., 11.","DOI":"10.3390\/rs11050493"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"163","DOI":"10.1007\/s11263-012-0600-1","article-title":"Learning Vocabularies over a Fine Quantization","volume":"103","author":"Mikulik","year":"2013","journal-title":"Int. J. Comput. Vis."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"237","DOI":"10.1007\/s11263-017-1016-8","article-title":"End-to-End Learning of Deep Visual Representations for Image Retrieval","volume":"124","author":"Gordo","year":"2017","journal-title":"Int. J. Comput. Vis."}],"container-title":["ISPRS International Journal of Geo-Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2220-9964\/8\/9\/393\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T13:16:44Z","timestamp":1760188604000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2220-9964\/8\/9\/393"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,9,4]]},"references-count":43,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2019,9]]}},"alternative-id":["ijgi8090393"],"URL":"https:\/\/doi.org\/10.3390\/ijgi8090393","relation":{},"ISSN":["2220-9964"],"issn-type":[{"type":"electronic","value":"2220-9964"}],"subject":[],"published":{"date-parts":[[2019,9,4]]}}}