{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,12]],"date-time":"2026-01-12T17:42:33Z","timestamp":1768239753208,"version":"3.49.0"},"reference-count":30,"publisher":"Association for Computing Machinery (ACM)","issue":"1","funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["No. T2322012, No. 62172218"],"award-info":[{"award-number":["No. T2322012, No. 62172218"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"RGC of the Hong Kong Special Administrative Region, China","award":["UGC\/FDS16\/E17\/23"],"award-info":[{"award-number":["UGC\/FDS16\/E17\/23"]}]},{"name":"Direct Grant","award":["DR25E8"],"award-info":[{"award-number":["DR25E8"]}]},{"name":"Faculty Research Grants of Lingnan University, Hong Kong","award":["SDS24A8 and SDS24A19"],"award-info":[{"award-number":["SDS24A8 and SDS24A19"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2026,1,31]]},"abstract":"<jats:p>\n                    Searching by image is popular yet still challenging in e-commerce due to the extensive interference arising from (i) data variations (e.g., background, pose, visual angle, brightness) of real-world captured images and (ii) similar images in the query dataset. This article studies a practically meaningful problem of beauty product retrieval (BPR) by neural networks. We broadly extract different types of image features and raise an intriguing question that whether these features are beneficial to (i) suppress data variations of real-world captured images and (ii) distinguish one image from others which look very similar but are intrinsically different beauty products in the dataset, therefore leading to an enhanced capability of BPR. 
To answer it, we present a novel\n                    <jats:italic toggle=\"yes\">v<\/jats:italic>\n                    ariable-attention neural network to understand the combination of\n                    <jats:italic toggle=\"yes\">m<\/jats:italic>\n                    ultiple features (termed VM-Net) of beauty product images. Considering that there are few publicly released training datasets for BPR, we establish a new dataset with more than one million images classified into more than 20K categories to improve both the generalization and anti-interference abilities of VM-Net and other methods. We verify the performance of VM-Net and its competitors on the benchmark dataset Perfect-500K, where VM-Net shows clear improvements over the competitors in terms of\n                    <jats:inline-formula content-type=\"math\/tex\">\n                      <jats:tex-math notation=\"LaTeX\" version=\"MathJax\">\\(MAP@7\\)<\/jats:tex-math>\n                    <\/jats:inline-formula>\n                    . 
The source code and dataset will be released upon publication.\n                  <\/jats:p>","DOI":"10.1145\/3773765","type":"journal-article","created":{"date-parts":[[2025,10,29]],"date-time":"2025-10-29T14:06:16Z","timestamp":1761746776000},"page":"1-19","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Search by Image: Deeply Exploring Beneficial Features for Beauty Product Retrieval"],"prefix":"10.1145","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0429-490X","authenticated-orcid":false,"given":"Mingqiang","family":"Wei","sequence":"first","affiliation":[{"name":"Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5384-7411","authenticated-orcid":false,"given":"Qian","family":"Sun","sequence":"additional","affiliation":[{"name":"Nanjing University of Aeronautics and Astronautics, Nanjing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0965-3617","authenticated-orcid":false,"given":"Haoran","family":"Xie","sequence":"additional","affiliation":[{"name":"Lingnan University, Hong Kong, Hong Kong"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2784-3449","authenticated-orcid":false,"given":"Dong","family":"Liang","sequence":"additional","affiliation":[{"name":"Nanjing University of Aeronautics and Astronautics, Nanjing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2447-4828","authenticated-orcid":false,"given":"Dingkun","family":"Zhu","sequence":"additional","affiliation":[{"name":"Jiangsu University of Technology, Changzhou, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3976-0053","authenticated-orcid":false,"given":"Fu 
Lee","family":"Wang","sequence":"additional","affiliation":[{"name":"Hong Kong Metropolitan University, Hong Kong, Hong Kong"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2026,1,12]]},"reference":[{"key":"e_1_3_1_2_2","unstructured":"ACM Multimedia. 2020. The Contest of AI Meets Beauty. Retrieved from https:\/\/challenge2020.perfectcorp.com\/"},{"key":"e_1_3_1_3_2","unstructured":"Alibaba 2020. The Contest of Product Identification Competition. Retrieved from https:\/\/tianchi.aliyun.com\/competition\/entrance\/231772\/introduction"},{"key":"e_1_3_1_4_2","first-page":"333","volume-title":"Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing","author":"Auli Michael","year":"2011","unstructured":"Michael Auli and Adam Lopez. 2011. Training a log-linear parser with loss functions via softmax-margin. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, 333\u2013343."},{"key":"e_1_3_1_5_2","first-page":"7999","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Brattoli Biagio","year":"2019","unstructured":"Biagio Brattoli, Karsten Roth, and Bj\u00f6rn Ommer. 2019. MIC: Mining interclass characteristics for improved metric learning. In Proceedings of the IEEE\/CVF International Conference on Computer Vision, 7999\u20138008."},{"key":"e_1_3_1_6_2","first-page":"677","volume-title":"Proceedings of the 16th European Conference on Computer Vision (ECCV \u201920)","volume":"12354","author":"Brown Andrew","year":"2020","unstructured":"Andrew Brown, Weidi Xie, Vicky Kalogeiton, and Andrew Zisserman. 2020. Smooth-AP: Smoothing the path towards large-scale image retrieval. In Proceedings of the 16th European Conference on Computer Vision (ECCV \u201920), Lecture Notes in Computer Science, Vol. 
12354, 677\u2013694."},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.1145\/1348246.1348248"},{"key":"e_1_3_1_8_2","first-page":"248","volume-title":"Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition","author":"Deng Jia","year":"2009","unstructured":"Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Fei-Fei Li. 2009. ImageNet: A large-scale hierarchical image database. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 248\u2013255."},{"key":"e_1_3_1_9_2","first-page":"8029","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Fang Pengfei","year":"2019","unstructured":"Pengfei Fang, Jieming Zhou, Soumava Kumar Roy, Lars Petersson, and Mehrtash Harandi. 2019. Bilinear attention networks for person retrieval. In Proceedings of the IEEE\/CVF International Conference on Computer Vision, 8029\u20138038."},{"key":"e_1_3_1_10_2","first-page":"369","volume-title":"Proceedings of the 16th European Conference on Computer Vision (ECCV \u201920)","volume":"12349","author":"Ge Yixiao","year":"2020","unstructured":"Yixiao Ge, Haibo Wang, Feng Zhu, Rui Zhao, and Hongsheng Li. 2020. Self-supervising fine-grained region similarities for large-scale image localization. In Proceedings of the 16th European Conference on Computer Vision (ECCV \u201920), Lecture Notes in Computer Science, Vol. 12349, 369\u2013386."},{"key":"e_1_3_1_11_2","first-page":"3417","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Jang Young Kyun","year":"2020","unstructured":"Young Kyun Jang and Nam Ik Cho. 2020. Generalized product quantization network for Semi-Supervised image retrieval. 
In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, 3417\u20133426."},{"key":"e_1_3_1_12_2","first-page":"5364","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Lee Seongwon","year":"2022","unstructured":"Seongwon Lee, Hongje Seong, Suhyeon Lee, and Euntai Kim. 2022. Correlation verification for image retrieval. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, 5364\u20135374."},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.106"},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1145\/3343031.3356055"},{"key":"e_1_3_1_15_2","first-page":"253","volume-title":"Proceedings of the 16th European Conference on Computer Vision (ECCV \u201920)","volume":"12370","author":"Ng Tony","year":"2020","unstructured":"Tony Ng, Vassileios Balntas, Yurun Tian, and Krystian Mikolajczyk. 2020. SOLAR: Second-order loss and attention for image retrieval. In Proceedings of the 16th European Conference on Computer Vision (ECCV \u201920), Lecture Notes in Computer Science, Vol. 12370, 253\u2013270."},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.690"},{"key":"e_1_3_1_17_2","unstructured":"Joseph Redmon and Ali Farhadi. 2018. YOLOv3: An incremental improvement. arXiv:1804.02767. Retrieved from https:\/\/arxiv.org\/abs\/1804.02767"},{"key":"e_1_3_1_18_2","first-page":"5106","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Revaud J\u00e9r\u00f4me","year":"2019","unstructured":"J\u00e9r\u00f4me Revaud, Jon Almaz\u00e1n, Rafael S. Rezende, and C\u00e9sar Roberto de Souza. 2019. Learning with average precision: Training image retrieval with a listwise loss. 
In Proceedings of the IEEE\/CVF International Conference on Computer Vision, 5106\u20135115."},{"key":"e_1_3_1_19_2","first-page":"6397","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Sun Yifan","year":"2020","unstructured":"Yifan Sun, Changmao Cheng, Yuhan Zhang, Chi Zhang, Liang Zheng, Zhongdao Wang, and Yichen Wei. 2020. Circle loss: A unified perspective of pair similarity optimization. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, 6397\u20136406."},{"key":"e_1_3_1_20_2","volume-title":"Proceedings of the 4th International Conference on Learning Representations","author":"Tolias Giorgos","year":"2016","unstructured":"Giorgos Tolias, Ronan Sicre, and Herv\u00e9 J\u00e9gou. 2016. Particular object retrieval with integral max-pooling of CNN activations. In Proceedings of the 4th International Conference on Learning Representations."},{"key":"e_1_3_1_21_2","doi-asserted-by":"crossref","first-page":"2548","DOI":"10.1145\/3343031.3356059","volume-title":"Proceedings of the 27th ACM International Conference on Multimedia","author":"Wang Jiawei","year":"2019","unstructured":"Jiawei Wang, Shuai Zhu, Jiao Xu, and Da Cao. 2019. The retrieval of the beautiful: Self-Supervised salient object detection for beauty product retrieval. In Proceedings of the 27th ACM International Conference on Multimedia, 2548\u20132552."},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11432-019-2721-0"},{"key":"e_1_3_1_23_2","first-page":"13022","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Wei Jun","year":"2020","unstructured":"Jun Wei, Shuhui Wang, Zhe Wu, Chi Su, Qingming Huang, and Qi Tian. 2020. Label decoupling framework for salient object detection. 
In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, 13022\u201313031."},{"key":"e_1_3_1_24_2","volume-title":"Proceedings of the 10th International Conference on Learning Representations","author":"Weinzaepfel Philippe","year":"2022","unstructured":"Philippe Weinzaepfel, Thomas Lucas, Diane Larlus, and Yannis Kalantidis. 2022. Learning super-features for image retrieval. In Proceedings of the 10th International Conference on Learning Representations."},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","DOI":"10.3233\/JIFS-179958"},{"key":"e_1_3_1_26_2","volume-title":"Proceedings of the 16th European Conference on Computer Vision (ECCV \u201920)","author":"Xuan Hong","year":"2020","unstructured":"Hong Xuan, Abby Stylianou, Xiaotong Liu, and Robert Pless. 2020. Hard negative examples are hard, but useful. In Proceedings of the 16th European Conference on Computer Vision (ECCV \u201920), Lecture Notes in Computer Science, Vol. 12359, 126\u2013142."},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1145\/3394171.3416272"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1145\/3394171.3416289"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1145\/3343031.3356065"},{"key":"e_1_3_1_30_2","doi-asserted-by":"crossref","first-page":"2558","DOI":"10.1145\/3343031.3356075","volume-title":"Proceedings of the 27th ACM International Conference on Multimedia","author":"Zhang Yi","year":"2019","unstructured":"Yi Zhang, Linzi Qu, Lihuo He, Wen Lu, and Xinbo Gao. 2019. Beauty aware network: An unsupervised method for makeup product retrieval. In Proceedings of the 27th ACM International Conference on Multimedia, 2558\u20132562."},{"key":"e_1_3_1_31_2","doi-asserted-by":"crossref","first-page":"1224","DOI":"10.1109\/TPAMI.2017.2709749","article-title":"SIFT meets CNN: A decade survey of instance retrieval","volume":"5","author":"Zheng Liang","year":"2018","unstructured":"Liang Zheng, Yi Yang, and Qi Tian. 
2018. SIFT meets CNN: A decade survey of instance retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence 5 (2018), 1224\u20131244.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3773765","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,1,12]],"date-time":"2026-01-12T14:32:30Z","timestamp":1768228350000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3773765"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,1,12]]},"references-count":30,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2026,1,31]]}},"alternative-id":["10.1145\/3773765"],"URL":"https:\/\/doi.org\/10.1145\/3773765","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,1,12]]},"assertion":[{"value":"2023-02-22","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-04-02","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2026-01-12","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}