{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,4]],"date-time":"2026-03-04T02:51:42Z","timestamp":1772592702853,"version":"3.50.1"},"reference-count":46,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2024,12,20]],"date-time":"2024-12-20T00:00:00Z","timestamp":1734652800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Italian MIUR within PRIN 2017, Project","award":["20172BH297"],"award-info":[{"award-number":["20172BH297"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2025,1,31]]},"abstract":"<jats:p>Recommending fashion items often leverages rich user profiles and makes targeted suggestions based on past history and previous purchases. In this paper, we work under the assumption that no prior knowledge is given about a user. We propose to build a user profile on the fly by integrating user reactions as we recommend complementary items to compose an outfit. We present a reinforcement learning agent capable of suggesting appropriate garments and ingesting user feedback so to improve its recommendations and maximize user satisfaction. To train such a model, we resort to a proxy model to be able to simulate having user feedback in the training loop. We experiment on the IQON3000 fashion dataset and we find that a reinforcement learning-based agent becomes capable of improving its recommendations by taking into account personal preferences. 
Furthermore, this task proved hard for non-reinforcement models, which cannot exploit exploration during training.<\/jats:p>","DOI":"10.1145\/3702327","type":"journal-article","created":{"date-parts":[[2024,11,2]],"date-time":"2024-11-02T15:46:04Z","timestamp":1730562364000},"page":"1-21","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Interactive Garment Recommendation with User in the Loop"],"prefix":"10.1145","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2537-2700","authenticated-orcid":false,"given":"Federico","family":"Becattini","sequence":"first","affiliation":[{"name":"University of Siena, Siena, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4638-0603","authenticated-orcid":false,"given":"Xiaolin","family":"Chen","sequence":"additional","affiliation":[{"name":"School of Software, Shandong University, Jinan, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-9250-9408","authenticated-orcid":false,"given":"Andrea","family":"Puccia","sequence":"additional","affiliation":[{"name":"University of Florence, Firenze, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0633-3722","authenticated-orcid":false,"given":"Haokun","family":"Wen","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5274-4197","authenticated-orcid":false,"given":"Xuemeng","family":"Song","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Shandong University, Jinan, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1476-0273","authenticated-orcid":false,"given":"Liqiang","family":"Nie","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1052-8322","authenticated-orcid":false,"given":"Alberto","family":"Del Bimbo","sequence":"additional","affiliation":[{"name":"University of Florence, Firenze, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2024,12,20]]},"reference":[{"key":"e_1_3_1_2_2","unstructured":"Lorenzo Agnolucci Alberto Baldrati Marco Bertini and Alberto Del Bimbo. 2024. iSEARLE: Improving textual inversion for zero-shot composed image retrieval. arXiv:2405.02951."},{"key":"e_1_3_1_3_2","doi-asserted-by":"crossref","unstructured":"Alberto Baldrati Davide Morelli Marcella Cornia Marco Bertini and Rita Cucchiara. 2024. Multimodal-conditioned latent diffusion models for fashion image editing. arXiv:2403.14828.","DOI":"10.1109\/ICCV51070.2023.02138"},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","DOI":"10.1145\/3552468.3555365"},{"issue":"24","key":"e_1_3_1_5_2","first-page":"1","article-title":"Fashion recommendation based on style and social events","volume":"82","author":"Becattini Federico","year":"2023","unstructured":"Federico Becattini, Lavinia De Divitiis, Claudio Baecchi, and Alberto Del Bimbo. 2023. Fashion recommendation based on style and social events. 
Multimedia Tools and Applications 82, 24 (2023), 1\u201316.","journal-title":"Multimedia Tools and Applications"},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1109\/TII.2022.3195063"},{"key":"e_1_3_1_7_2","first-page":"1","volume-title":"Proceedings of the International Conference on ACM Multimedia Asia","author":"Becattini Federico","year":"2021","unstructured":"Federico Becattini, Xuemeng Song, Claudio Baecchi, Shi-Ting Fang, Claudio Ferrari, Liqiang Nie, and Alberto Del Bimbo. 2021. PLM-IPE: A pixel-landmark mutual enhanced framework for implicit preference estimation. In Proceedings of the International Conference on ACM Multimedia Asia, 1\u20135."},{"key":"e_1_3_1_8_2","article-title":"Transformer-based graph neural networks for outfit generation","author":"Becattini Federico","year":"2023","unstructured":"Federico Becattini, Federico Maria Teotini, and Alberto Del Bimbo. 2023b. Transformer-based graph neural networks for outfit generation. IEEE Transactions on Emerging Topics in Computing (2023).","journal-title":"IEEE Transactions on Emerging Topics in Computing"},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1145\/3394171.3413530"},{"key":"e_1_3_1_10_2","first-page":"2998","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Chen Yanbei","year":"2020","unstructured":"Yanbei Chen, Shaogang Gong, and Loris Bazzani. 2020. Image search with text feedback by visiolinguistic attention learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2998\u20133008."},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1145\/3308558.3313444"},{"key":"e_1_3_1_12_2","first-page":"282","volume-title":"International conference on pattern recognition","author":"Divitiis Lavinia De","year":"2021","unstructured":"Lavinia De Divitiis, Federico Becattini, Claudio Baecchi, and Alberto Del Bimbo. 2021. 
Garment recommendation with memory augmented neural networks. In International conference on pattern recognition. Springer, 282\u2013295."},{"key":"e_1_3_1_13_2","first-page":"1","volume-title":"Proceedings of the International Conference on Content-Based Multimedia Indexing (CBMI)","author":"Divitiis Lavinia De","year":"2021","unstructured":"Lavinia De Divitiis, Federico Becattini, Claudio Baecchi, and Alberto Del Bimbo. 2021. Style-based outfit recommendation. In Proceedings of the International Conference on Content-Based Multimedia Indexing (CBMI). IEEE, 1\u20134."},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1145\/3531017"},{"key":"e_1_3_1_15_2","first-page":"486","article-title":"A theoretical analysis of deep Q-learning","author":"Fan Jianqing","year":"2020","unstructured":"Jianqing Fan, Zhaoran Wang, Yuchen Xie, and Zhuoran Yang. 2020. A theoretical analysis of deep Q-learning. In Learning for Dynamics and Control. PMLR, 486\u2013489.","journal-title":"Learning for Dynamics and Control"},{"key":"e_1_3_1_16_2","first-page":"4600","volume-title":"Proceedings of the International ACM Conference on Multimedia","author":"Gu Chunbin","year":"2021","unstructured":"Chunbin Gu, Jiajun Bu, Zhen Zhang, Zhi Yu, Dongfang Ma, and Wei Wang. 2021. Image search with text feedback by deep hierarchical attention mutual information maximization. In Proceedings of the International ACM Conference on Multimedia. ACM, New York, NY, 4600\u20134609."},{"key":"e_1_3_1_17_2","doi-asserted-by":"crossref","first-page":"268","DOI":"10.1145\/3503161.3548020","volume-title":"Proceedings of the ACM International Conference on Multimedia","author":"Guan Weili","year":"2022","unstructured":"Weili Guan, Xuemeng Song, Haoyu Zhang, Meng Liu, Chung-Hsing Yeh, and Xiaojun Chang. 2022. Bi-directional heterogeneous graph hashing towards efficient outfit recommendation. In Proceedings of the ACM International Conference on Multimedia. 
ACM, New York, NY, 268\u2013276."},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2022.3187290"},{"key":"e_1_3_1_19_2","first-page":"676","volume-title":"Proceedings of the Conference on Neural Information Processing Systems","author":"Guo Xiaoxiao","year":"2018","unstructured":"Xiaoxiao Guo, Hui Wu, Yu Cheng, Steven Rennie, Gerald Tesauro, and Rog\u00e9rio Schmidt Feris. 2018. Dialog-based Interactive Image Retrieval. In Proceedings of the Conference on Neural Information Processing Systems, 676\u2013686."},{"key":"e_1_3_1_20_2","first-page":"1472","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"Han Xintong","year":"2017","unstructured":"Xintong Han, Zuxuan Wu, Phoenix X. Huang, Xiao Zhang, Menglong Zhu, Yuan Li, Yang Zhao, and Larry S. Davis. 2017. Automatic spatially-aware fashion concept discovery. In Proceedings of the IEEE International Conference on Computer Vision. IEEE, 1472\u20131480."},{"key":"e_1_3_1_21_2","first-page":"1078","volume-title":"Proceedings of the International ACM Conference on Multimedia","author":"Han Xintong","year":"2017","unstructured":"Xintong Han, Zuxuan Wu, Yu-Gang Jiang, and Larry S. Davis. 2017b. Learning fashion compatibility with bidirectional LSTMs. In Proceedings of the International ACM Conference on Multimedia. ACM, New York, NY, 1078\u20131086."},{"key":"e_1_3_1_22_2","first-page":"802","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Lee Seungmin","year":"2021","unstructured":"Seungmin Lee, Dongwan Kim, and Bohyung Han. 2021. CoSMo: Content-style modulation for image retrieval with text feedback. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 
Computer Vision Foundation\/IEEE, 802\u2013812."},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.1007\/BF00992699"},{"key":"e_1_3_1_24_2","first-page":"2014","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Niepert Mathias","year":"2016","unstructured":"Mathias Niepert, Mohamed Ahmed, and Konstantin Kutzkov. 2016. Learning convolutional neural networks for graphs. In Proceedings of the International Conference on Machine Learning. JMLR.org, 2014\u20132023."},{"key":"e_1_3_1_25_2","unstructured":"Anwesan Pal Sahil Wadhwa Ayush Jaiswal Xu Zhang Yue Wu Rakesh Chada Pradeep Natarajan and Henrik I Christensen. 2023. FashionNTM: Multi-turn fashion image retrieval via cascaded memory. arXiv:2308.10170."},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.patrec.2023.06.018"},{"key":"e_1_3_1_27_2","unstructured":"Karin Sevegnani Arjun Seshadri Tian Wang Anurag Beniwal Julian J. McAuley Alan Lu and Gerard Medioni. 2022. Contrastive learning for interactive recommendation in fashion. arXiv:2207.12033. Retrieved from https:\/\/arxiv.org\/abs\/2207.12033"},{"key":"e_1_3_1_28_2","first-page":"272","volume-title":"Web-Age Information Management International Conference","author":"Sha Dandan","year":"2016","unstructured":"Dandan Sha, Daling Wang, Xiangmin Zhou, Shi Feng, Yifei Zhang, and Ge Yu. 2016. An approach for clothing recommendation based on multiple image attributes. In Web-Age Information Management International Conference. Springer, 272\u2013285."},{"key":"e_1_3_1_29_2","doi-asserted-by":"crossref","first-page":"753","DOI":"10.1145\/3123266.3123314","volume-title":"Proceedings of the International ACM Conference on Multimedia","author":"Song Xuemeng","year":"2017","unstructured":"Xuemeng Song, Fuli Feng, Jinhuan Liu, Zekun Li, Liqiang Nie, and Jun Ma. 2017. NeuroStylist: Neural compatibility modeling for clothing matching. In Proceedings of the International ACM Conference on Multimedia. 
ACM, New York, NY, 753\u2013761."},{"key":"e_1_3_1_30_2","first-page":"320","volume-title":"Proceedings of the ACM International Conference on Multimedia","author":"Song Xuemeng","year":"2019","unstructured":"Xuemeng Song, Xianjing Han, Yunkai Li, Jingyuan Chen, Xin-Shun Xu, and Liqiang Nie. 2019a. GP-BPR: Personalized compatibility modeling for clothing matching. In Proceedings of the ACM International Conference on Multimedia. ACM, New York, NY, 320\u2013328."},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.1145\/3343031.3350956"},{"key":"e_1_3_1_32_2","volume-title":"Reinforcement Learning: An Introduction","author":"Sutton Richard S.","year":"2018","unstructured":"Richard S. Sutton and Andrew G. Barto. 2018. Reinforcement Learning: An Introduction. MIT Press."},{"key":"e_1_3_1_33_2","first-page":"405","volume-title":"Proceedings of the European Conference on Computer Vision","author":"Vasileva Mariya I.","year":"2018","unstructured":"Mariya I. Vasileva, Bryan A. Plummer, Krishna Dusad, Shreya Rajpal, Ranjitha Kumar, and David A. Forsyth. 2018. Learning type-aware embeddings for fashion compatibility. In Proceedings of the European Conference on Computer Vision. Springer, 405\u2013421."},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00660"},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1145\/3404835.3462967"},{"key":"e_1_3_1_36_2","first-page":"915","volume-title":"Proceedings of the ACM International Conference on Multimedia","author":"Wen Haokun","year":"2023","unstructured":"Haokun Wen, Xian Zhang, Xuemeng Song, Yinwei Wei, and Liqiang Nie. 2023. Target-guided composed image retrieval. In Proceedings of the ACM International Conference on Multimedia. 
ACM, New York, NY, 915\u2013923."},{"key":"e_1_3_1_37_2","first-page":"11307","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Wu Hui","year":"2021","unstructured":"Hui Wu, Yupeng Gao, Xiaoxiao Guo, Ziad Al-Halah, Steven Rennie, Kristen Grauman, and Rog\u00e9rio Feris. 2021. Fashion IQ: A new dataset towards retrieving images by natural language feedback. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 11307\u201311317."},{"key":"e_1_3_1_38_2","first-page":"7309","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Wu Wei","year":"2022","unstructured":"Wei Wu, Jiawei Liu, Kecheng Zheng, Qibin Sun, and Zhengjun Zha. 2022a. Temporal complementarity-guided reinforcement learning for image-to-video person re-identification. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. IEEE, 7309\u20137318."},{"key":"e_1_3_1_39_2","first-page":"124","volume-title":"Proceedings of the ACM Conference on Recommender Systems","author":"Wu Yaxiong","year":"2022","unstructured":"Yaxiong Wu, Craig Macdonald, and Iadh Ounis. 2022b. Multi-modal dialog state tracking for interactive fashion recommendation. In Proceedings of the ACM Conference on Recommender Systems. ACM, New York, NY, 124\u2013133."},{"key":"e_1_3_1_40_2","first-page":"931","volume-title":"Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Xin Xin","year":"2020","unstructured":"Xin Xin, Alexandros Karatzoglou, Ioannis Arapakis, and Joemon M. Jose. 2020. Self-Supervised reinforcement learning for recommender systems. In Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval. 
ACM, New York, NY, 931\u2013940."},{"key":"e_1_3_1_41_2","first-page":"8806","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Yanai Chen","year":"2022","unstructured":"Chen Yanai, Adir Solomon, Gilad Katz, Bracha Shapira, and Lior Rokach. 2022. Q-Ball: Modeling basketball games using deep reinforcement learning. In Proceedings of the AAAI Conference on Artificial Intelligence. AAAI Press, 8806\u20138813."},{"key":"e_1_3_1_42_2","first-page":"649","volume-title":"Proceedings of the World Wide Web Conference on World Wide Web","author":"Yu Wenhui","year":"2018","unstructured":"Wenhui Yu, Huidi Zhang, Xiangnan He, Xu Chen, Li Xiong, and Zheng Qin. 2018. Aesthetic-based clothing recommendation. In Proceedings of the World Wide Web Conference on World Wide Web. ACM, New York, NY, 649\u2013658."},{"key":"e_1_3_1_43_2","doi-asserted-by":"publisher","DOI":"10.1145\/3404835.3462881"},{"key":"e_1_3_1_44_2","first-page":"5353","volume-title":"Proceedings of the International ACM Conference on Multimedia","author":"Zhang Gangjian","year":"2021","unstructured":"Gangjian Zhang, Shikui Wei, Huaxin Pang, and Yao Zhao. 2021. Heterogeneous feature fusion and cross-modal alignment for composed image retrieval. In Proceedings of the International ACM Conference on Multimedia. ACM, New York, NY, 5353\u20135362."},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.652"},{"key":"e_1_3_1_46_2","doi-asserted-by":"publisher","DOI":"10.1145\/3339363.3339370"},{"key":"e_1_3_1_47_2","first-page":"167","volume-title":"Proceedings of the World Wide Web Conference on World Wide Web","author":"Zheng Guanjie","year":"2018","unstructured":"Guanjie Zheng, Fuzheng Zhang, Zihan Zheng, Yang Xiang, Nicholas Jing Yuan, Xing Xie, and Zhenhui Li. 2018. DRN: A deep reinforcement learning framework for news recommendation. In Proceedings of the World Wide Web Conference on World Wide Web. 
ACM, New York, NY, 167\u2013176."}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3702327","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3702327","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:18:08Z","timestamp":1750295888000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3702327"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,12,20]]},"references-count":46,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2025,1,31]]}},"alternative-id":["10.1145\/3702327"],"URL":"https:\/\/doi.org\/10.1145\/3702327","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,12,20]]},"assertion":[{"value":"2023-11-09","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-10-13","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-12-20","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}