{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,25]],"date-time":"2026-02-25T18:18:51Z","timestamp":1772043531683,"version":"3.50.1"},"reference-count":83,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2025,11,5]],"date-time":"2025-11-05T00:00:00Z","timestamp":1762300800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>Integrating recommendation systems with dynamic pricing strategies is essential for enhancing product sales and optimizing revenue in modern business. This study proposes a novel product recommendation model that uses Reinforcement Learning to tailor pricing strategies to customer purchase intentions. While traditional recommendation systems focus on identifying products customers prefer, they often neglect the critical factor of pricing. To improve effectiveness and increase conversion, it is crucial to personalize product prices according to the customer\u2019s willingness to pay (WTP). Businesses often use fixed-budget promotions to boost sales, emphasizing the importance of strategic pricing. Designing intelligent promotions requires recommending products aligned with customer preferences and setting prices reflecting their WTP, thus increasing the likelihood of purchase. This research advances existing recommendation systems by integrating dynamic pricing into the system\u2019s output, offering a significant innovation in business practice. However, this integration introduces technical complexities, which are addressed through a Markov Decision Process (MDP) framework and solved using Reinforcement Learning. Empirical evaluation using the Dunnhumby dataset shows promising results. 
Due to the lack of direct comparisons between combined product recommendation and pricing models, the outputs were simplified into two categories: purchase and non-purchase. This approach revealed significant improvements over comparable methods, demonstrating the model\u2019s efficacy.<\/jats:p>","DOI":"10.3390\/a18110706","type":"journal-article","created":{"date-parts":[[2025,11,5]],"date-time":"2025-11-05T17:06:06Z","timestamp":1762362366000},"page":"706","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Product Recommendation with Price Personalization According to Customer\u2019s Willingness to Pay Using Deep Reinforcement Learning"],"prefix":"10.3390","volume":"18","author":[{"ORCID":"https:\/\/orcid.org\/0009-0006-2922-6953","authenticated-orcid":false,"given":"Ali","family":"Mahdavian","sequence":"first","affiliation":[{"name":"Department of Electrical and Computer Engineering, University of Tehran, Tehran 14179-35840, Iran"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4916-9408","authenticated-orcid":false,"given":"Hadi","family":"Moradi","sequence":"additional","affiliation":[{"name":"Department of Electrical and Computer Engineering, University of Tehran, Tehran 14179-35840, Iran"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4429-2511","authenticated-orcid":false,"given":"Behnam","family":"Bahrak","sequence":"additional","affiliation":[{"name":"Tehran Institute for Advanced Studies, Khatam University, Tehran 19916-33357, Iran"}]}],"member":"1968","published-online":{"date-parts":[[2025,11,5]]},"reference":[{"key":"ref_1","unstructured":"StartUp Guru Lab (2025, August 06). Customer Acquisition vs Retention Costs. Available online: https:\/\/startupgurulab.com\/customer-acquisition-vs-retention-costs."},{"key":"ref_2","unstructured":"Optimove (2025, August 06). Customer Acquisition vs Retention Costs: Why Retention is More Profitable. 
Available online: https:\/\/www.optimove.com\/resources\/learning-center\/customer-acquisition-vs-retention-costs."},{"key":"ref_3","unstructured":"Gallo, A. (2025, August 06). The Value of Keeping the Right Customers. Harvard Business Review. Available online: https:\/\/hbr.org\/2014\/10\/the-value-of-keeping-the-right-customers."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Chen, K., Chen, T., Zheng, G., Jin, O., Yao, E., and Yu, Y. (2012, January 12\u201316). Collaborative personalized tweet recommendation. Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval, Portland, OR, USA.","DOI":"10.1145\/2348283.2348372"},{"key":"ref_5","first-page":"102783","article-title":"AI-powered marketing: What, where, and how?","volume":"77","author":"Kumar","year":"2024","journal-title":"Int. J. Inf. Manag."},{"key":"ref_6","first-page":"102716","article-title":"Generative artificial intelligence in marketing: Applications, opportunities, challenges, and research agenda","volume":"75","author":"Kshetri","year":"2024","journal-title":"Int. J. Inf. Manag."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"120864","DOI":"10.1016\/j.eswa.2023.120864","article-title":"Dynamic discount pricing in online retail systems: Effects of post-discount dynamic forces","volume":"232","year":"2023","journal-title":"Expert Syst. Appl."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"124553","DOI":"10.1016\/j.eswa.2024.124553","article-title":"STAD-GCN: Spatial\u2013Temporal Attention-based Dynamic Graph Convolutional Network for retail market price prediction","volume":"255","author":"Kim","year":"2024","journal-title":"Expert Syst. Appl."},{"key":"ref_9","first-page":"101","article-title":"Developing customer product loyalty through mobile advertising: Affective and cognitive perspectives","volume":"47","author":"Lu","year":"2019","journal-title":"Int. J. Inf. 
Manag."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3190616","article-title":"Sequence-aware recommender systems","volume":"51","author":"Quadrana","year":"2018","journal-title":"Acm Comput. Surv."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"141","DOI":"10.1016\/j.neucom.2021.11.041","article-title":"A survey of recommender systems with multi-objective optimization","volume":"474","author":"Zheng","year":"2022","journal-title":"Neurocomputing"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"961","DOI":"10.1142\/S0219622012500289","article-title":"Optimization of online promotion: A profit-maximizing model integrating price discount and product recommendation","volume":"11","author":"Jiang","year":"2012","journal-title":"Int. J. Inf. Technol. Decis. Mak."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"120131","DOI":"10.1016\/j.eswa.2023.120131","article-title":"A systematic review of value-aware recommender systems","volume":"226","author":"Montagna","year":"2023","journal-title":"Expert Syst. Appl."},{"key":"ref_14","unstructured":"Cantador, I., Bellog\u00edn, A., and Castells, P. In Proceedings of the 2nd International Workshop on Information Heterogeneity and Fusion in Recommender Systems, Chicago, IL, USA, 27 October 2011."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3543846","article-title":"Reinforcement Learning based Recommender Systems: A Survey","volume":"55","author":"Afsar","year":"2022","journal-title":"ACM Comput. Surv."},{"key":"ref_16","unstructured":"Lin, Y., Liu, Y., Lin, F., Zou, L., Wu, P., Zeng, W., Chen, H., and Miao, C. (2021). A Survey on Reinforcement Learning for Recommender Systems. 
arXiv."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"110335","DOI":"10.1016\/j.knosys.2023.110335","article-title":"Deep reinforcement learning in recommender systems: A survey and new perspectives","volume":"264","author":"Chen","year":"2023","journal-title":"Knowl.-Based Syst."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"125238","DOI":"10.1016\/j.eswa.2024.125238","article-title":"A deep residual reinforcement learning algorithm based on Soft Actor-Critic for autonomous navigation","volume":"259","author":"Wen","year":"2025","journal-title":"Expert Syst. Appl."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Wang, J., Karatzoglou, A., Arapakis, I., and Jose, J.M. (2024, January 14\u201318). Reinforcement Learning-based Recommender Systems with Large Language Models for State Reward and Action Modeling. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR \u201924), Washington, DC, USA.","DOI":"10.1145\/3626772.3657767"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"128570","DOI":"10.1016\/j.eswa.2025.128570","article-title":"Dynamic pricing and inventory control of perishable products by a deep reinforcement learning algorithm","volume":"291","author":"Kavoosi","year":"2025","journal-title":"Expert Syst. Appl."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"103167","DOI":"10.1016\/j.inffus.2025.103167","article-title":"PGA-DRL: Progressive graph attention-based deep reinforcement learning for recommender systems","volume":"121","author":"Tanveer","year":"2025","journal-title":"Inf. Fusion"},{"key":"ref_22","first-page":"59","article-title":"Customer behavior in the prior purchase stage \u2013 information search versus recommender systems","volume":"54","author":"Mican","year":"2020","journal-title":"Econ. Comput. Econ. Cybern. Stud. 
Res."},{"key":"ref_23","first-page":"150","article-title":"The role of online product recommendations on customer decision making and loyalty in social shopping communities","volume":"38","author":"Zhang","year":"2018","journal-title":"Int. J. Inf. Manag."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Ramampiaro, H.e.a. (2019). New Ideas in Ranking for Personalized Fashion Recommender Systems. Business and Consumer Analytics: New Ideas, Springer.","DOI":"10.1007\/978-3-030-06222-4_25"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"1","DOI":"10.7903\/ijecs.1348","article-title":"Developing a price-sensitive recommender system to improve accuracy and business performance of ecommerce applications","volume":"6","author":"Umberto","year":"2015","journal-title":"Int. J. Electron. Commer. Stud."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"103525","DOI":"10.1016\/j.ipm.2023.103525","article-title":"GPR-OPT: A Practical Gaussian optimization criterion for implicit recommender systems","volume":"61","author":"Bai","year":"2024","journal-title":"Inf. Process. Manag."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"121072","DOI":"10.1016\/j.eswa.2023.121072","article-title":"Modeling users\u2019 preference changes in recommender systems via time-dependent Markov random fields","volume":"234","author":"Pujahari","year":"2023","journal-title":"Expert Syst. Appl."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Beheshti, A., Yakhchi, S., Mousaeirad, S., Ghafari, S.M., Goluguri, S.R., and Edrisi, M.A. (2020). Towards Cognitive Recommender Systems. Algorithms, 13.","DOI":"10.3390\/a13080176"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"799","DOI":"10.1146\/annurev-psych-010213-115043","article-title":"Emotion and decision making","volume":"66","author":"Lerner","year":"2015","journal-title":"Annu. Rev. Psychol."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Alfaifi, Y.H. (2024). 
Recommender Systems Applications: Data Sources, Features, and Challenges. Information, 15.","DOI":"10.3390\/info15100660"},{"key":"ref_31","unstructured":"Livne, A., Unger, M., Shapira, B., and Rokach, L. Deep Context-Aware Recommender System Utilizing Sequential Latent Context. Proceedings of the 13th ACM Conference on Recommender Systems\u2014CARS Workshop, Copenhagen, Denmark."},{"key":"ref_32","unstructured":"Zhao, W.X., Guo, Y., He, Y., Jiang, H., Wu, Y., and Li, X. We Know What You Want to Buy: A Demographic-Based System for Product Recommendation on Microblogs. Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD\u201914), New York, NY, USA."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Tkal\u010di\u010d, M., De Carolis, B., de Gemmis, M., Odi\u0107, A., and Ko\u0161ir, A. (2016). Emotions and Personality in Personalized Services: Models, Evaluation and Applications, Springer.","DOI":"10.1007\/978-3-319-31413-6"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Ricci, F., Rokach, L., and Shapira, B. (2011). Introduction to recommender systems handbook. Recommender Systems Handbook, Springer.","DOI":"10.1007\/978-0-387-85820-3"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1016\/j.knosys.2016.08.013","article-title":"Recommender Systems for Product Bundling","volume":"111","author":"Beladev","year":"2016","journal-title":"Knowl.-Based Syst."},{"key":"ref_36","unstructured":"Kouki, P., Fountalis, I., Vasiloglou, N., Yan, N., Ahsan, U., Al Jadda, K., and Qu, H. Product Collection Recommendation in Online Retail. Proceedings of the 13th ACM Conference on Recommender Systems (RecSys \u201919), Copenhagen, Denmark."},{"key":"ref_37","unstructured":"Wu, S., Sun, F., Zhang, W., Xie, X., and Cui, B. (2020). Graph Neural Networks in Recommender Systems: A Survey. 
arXiv."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"103956","DOI":"10.1016\/j.ipm.2024.103956","article-title":"Multi-view graph contrastive representation learning for bundle recommendation","volume":"62","author":"Zhang","year":"2025","journal-title":"Inf. Process. Manag."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"271","DOI":"10.1007\/s10844-018-0542-3","article-title":"A Survey on Group Recommender Systems","volume":"54","author":"Dara","year":"2019","journal-title":"J. Intell. Inf. Syst."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"94","DOI":"10.1016\/j.elerap.2018.01.012","article-title":"Recommendation system development for fashion retail e-commerce","volume":"28","author":"Hwangbo","year":"2018","journal-title":"Electron. Commer. Res. Appl."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"89","DOI":"10.1109\/MIS.2021.3092768","article-title":"Exploring Customer Price Preference and Product Profit Role in Recommender Systems","volume":"37","author":"Kompan","year":"2022","journal-title":"IEEE Intell. Syst."},{"key":"ref_42","first-page":"102061","article-title":"Beyond user experience: What constitutes algorithmic experiences?","volume":"52","author":"Shin","year":"2020","journal-title":"Int. J. Inf. Manag."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"1325","DOI":"10.1108\/K-03-2019-0199","article-title":"A new model for assessing the role of customer behavior history, product classification, and prices on the success of the recommender systems in e-commerce","volume":"49","author":"Wakil","year":"2020","journal-title":"Kybernetes"},{"key":"ref_44","unstructured":"Zheng, Y., Gao, C., He, X., Li, Y., and Jin, D. (2020, January 20\u201324). Price-aware recommendation with graph convolutional networks. 
Proceedings of the International Conference on Data Engineering, Dallas, TX, USA."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"92","DOI":"10.1016\/j.ipm.2007.01.020","article-title":"Willingness to pay and experienced utility as measures of affective value of information objects: Users\u2019 accounts","volume":"44","author":"Lopatovska","year":"2008","journal-title":"Inf. Process. Manag."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Chen, J., Jin, Q., Zhao, S., Bao, S., Zhang, L., Su, Z., and Yu, Y. (2014, January 6\u201311). Does product recommendation meet its Waterloo in unexplored categories? No, price comes to help. Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval, Gold Coast, QLD, Australia.","DOI":"10.1145\/2600428.2609608"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"353","DOI":"10.1016\/S0022-4359(99)80100-6","article-title":"The Effects of Framing Price Promotion Messages on Consumers\u2019 Perceptions and Purchase Intentions","volume":"74","author":"Kent","year":"1998","journal-title":"J. Retail."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1016\/j.knosys.2018.02.026","article-title":"Personal price aware multi-seller recommender system: Evidence from eBay","volume":"150","author":"Rokach","year":"2018","journal-title":"Knowl.-Based Syst."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"58","DOI":"10.1002\/dir.20067","article-title":"Price customization using price thresholds estimated from scanner panel data","volume":"20","author":"Terui","year":"2006","journal-title":"J. Interact. Mark."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Sato, M., Izumo, H., and Sonoda, T. (2015, January 16\u201320). Discount sensitive recommender system for retail business. 
Proceedings of the EMPIRE \u201915: 3rd Workshop on Emotions and Personality in Personalized Systems 2015, Vienna, Austria.","DOI":"10.1145\/2809643.2809646"},{"key":"ref_51","unstructured":"Jannach, D., and Adomavicius, G. (2017). Price and Profit Awareness in Recommender Systems. arXiv."},{"key":"ref_52","unstructured":"Sato, M., Izumo, H., and Sonoda, T. (2025, October 28). Model of Personal Discount Sensitivity in Recommender Systems. arXiv, Available online: https:\/\/ixdea.org\/28_6\/."},{"key":"ref_53","unstructured":"Das, A., Mathieu, C., and Ricketts, D. (2009). Maximizing profit using recommender systems. arXiv."},{"key":"ref_54","unstructured":"Sutton, R.S., and Barto, A.G. (2018). Reinforcement Learning: An Introduction, MIT Press. [2nd ed.]."},{"key":"ref_55","unstructured":"LiChun, C., and ZhiMin, Z. (2019, January 19\u201321). An overview of deep reinforcement learning. Proceedings of the 2019 4th International Conference on Automation, Control and Robotics Engineering, Shenzhen, China."},{"key":"ref_56","doi-asserted-by":"crossref","first-page":"125834","DOI":"10.1016\/j.eswa.2024.125834","article-title":"Multi-objective optimization approach for permanent magnet machine via improved soft actor\u2013critic based on deep reinforcement learning","volume":"264","author":"Wang","year":"2025","journal-title":"Expert Syst. Appl."},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Xin, X., Karatzoglou, A., Arapakis, I., and Jose, J.M. (2020, January 25\u201330). Self-supervised reinforcement learning for recommender systems. Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual.","DOI":"10.1145\/3397271.3401147"},{"key":"ref_58","doi-asserted-by":"crossref","unstructured":"Wu, Y., MacDonald, C., and Ounis, I. (October, January 27). Partially observable reinforcement learning for dialog-based interactive recommendation. 
Proceedings of the RecSys 2021\u201415th ACM Conference on Recommender Systems, Amsterdam, The Netherlands.","DOI":"10.1145\/3460231.3474256"},{"key":"ref_59","doi-asserted-by":"crossref","first-page":"121648","DOI":"10.1016\/j.ins.2024.121648","article-title":"Deep reinforcement learning-guided coevolutionary algorithm for constrained multiobjective optimization","volume":"692","author":"Luo","year":"2025","journal-title":"Inf. Sci."},{"key":"ref_60","doi-asserted-by":"crossref","unstructured":"Lei, Y., Wang, Z., Li, W., and Pei, H. (2019, January 21\u201325). Social attentive deep Q-network for recommendation. Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, Paris, France.","DOI":"10.1145\/3331184.3331302"},{"key":"ref_61","unstructured":"Farris, L. (2021). Optimized recommender systems with deep reinforcement learning. arXiv."},{"key":"ref_62","doi-asserted-by":"crossref","unstructured":"Van Hasselt, H., Guez, A., and Silver, D. (2016, January 12\u201317). Deep reinforcement learning with double Q-learning. Proceedings of the AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA.","DOI":"10.1609\/aaai.v30i1.10295"},{"key":"ref_63","unstructured":"Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., and Riedmiller, M. (2013). Playing Atari with deep reinforcement learning. arXiv."},{"key":"ref_64","doi-asserted-by":"crossref","unstructured":"Li, J., and Chen, B. A Deep Q-Learning Optimization Framework for Dynamic Pricing in E-Commerce. 
Proceedings of the 2025 4th International Conference on Cyber Security, Artificial Intelligence and the Digital Economy (CSAIDE 2025), Kuala Lumpur, Malaysia.","DOI":"10.1145\/3729706.3729764"},{"key":"ref_65","doi-asserted-by":"crossref","first-page":"100139","DOI":"10.1016\/j.segy.2024.100139","article-title":"Deep Reinforcement Learning Based Dynamic Pricing for Demand Response Considering Market and Supply Constraints","volume":"14","author":"Fraija","year":"2024","journal-title":"Smart Energy"},{"key":"ref_66","doi-asserted-by":"crossref","unstructured":"Nomura, Y., Liu, Z., and Nishi, T. (2025). Deep Reinforcement Learning for Dynamic Pricing and Ordering Policies in Perishable Inventory Management. Appl. Sci., 15.","DOI":"10.3390\/app15052421"},{"key":"ref_67","doi-asserted-by":"crossref","unstructured":"Wang, G., Ding, J., and Hu, F. (2024). Deep Reinforcement Learning Recommendation System Algorithm Based on Multi-Level Attention Mechanisms. Electronics, 13.","DOI":"10.3390\/electronics13234625"},{"key":"ref_68","unstructured":"Rossiiev, O.D., Shapovalova, N.N., Rybalchenko, O.H., and Striuk, A.M. (2025, January 27). A Comprehensive Survey on Reinforcement Learning-based Recommender Systems: State-of-the-Art, Challenges, and Future Perspectives. Proceedings of the 7th Workshop for Young Scientists in Computer Science and Software Engineering (CSSE@SW 2024). CEUR Workshop Proceedings, Kryvyi Rih, Ukraine."},{"key":"ref_69","unstructured":"Gupta, R., Lin, J., and Meng, F. (2025). Multi-agent Deep Reinforcement Learning for Interdependent Pricing in Supply Chains. arXiv."},{"key":"ref_70","unstructured":"Zhang, D., Zhao, Y., and Sun, L. (2024, January 14\u201318). Reinforcement Learning in Recommender Systems: Progress, Challenges, and Opportunities. 
Proceedings of the 18th ACM Conference on Recommender Systems (RecSys), Bari, Italy."},{"key":"ref_71","unstructured":"Dulac-Arnold, G., Evans, R., van Hasselt, H., Sunehag, P., Lillicrap, T., Hunt, J., Mann, T., Weber, T., Degris, T., and Coppin, B. (2015). Deep Reinforcement Learning in Large Discrete Action Spaces. arXiv."},{"key":"ref_72","doi-asserted-by":"crossref","unstructured":"Liu, S., Cai, Q., Sun, B., Wang, Y., Jiang, J., Zheng, D., Gai, K., Jiang, P., Zhao, X., and Zhang, Y. (2023). Exploration and Regularization of the Latent Action Space in Recommendation. arXiv.","DOI":"10.1145\/3543507.3583244"},{"key":"ref_73","doi-asserted-by":"crossref","first-page":"102982","DOI":"10.1016\/j.ipm.2022.102982","article-title":"Interest Evolution-driven Gated Neighborhood aggregation representation for dynamic recommendation in e-commerce","volume":"59","author":"Liu","year":"2022","journal-title":"Inf. Process. Manag."},{"key":"ref_74","doi-asserted-by":"crossref","unstructured":"Wang, K., Zou, Z., Zhao, M., Deng, Q., Shang, Y., Liang, Y., Wu, R., Shen, X., Lyu, T., and Fan, C. (2023, January 23\u201327). Rl4rs: A real-world dataset for reinforcement learning based recommender system. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, Taipei, Taiwan.","DOI":"10.1145\/3539618.3591899"},{"key":"ref_75","first-page":"913","article-title":"XSimGCL: Towards Extremely Simple Graph Contrastive Learning for Recommendation","volume":"36","author":"Yu","year":"2023","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_76","doi-asserted-by":"crossref","unstructured":"Wang, W., Xu, Y., Feng, F., Lin, X., He, X., and Chua, T. (2023, January 23\u201327). Diffusion Recommender Model. 
Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR \u201923), Taipei, Taiwan.","DOI":"10.1145\/3539618.3591663"},{"key":"ref_77","doi-asserted-by":"crossref","unstructured":"Zhao, X., Xia, L., Zhang, L., Ding, Z., Yin, D., and Tang, J. (2018, January 2\u20137). Deep Reinforcement Learning for Page-wise Recommendations. Proceedings of the Twelfth ACM Conference on Recommender Systems (RecSys \u201918), Vancouver, BC, Canada.","DOI":"10.1145\/3240323.3240374"},{"key":"ref_78","unstructured":"Grandini, M., Bagli, E., and Visani, G. (2020). Metrics for Multi-Class Classification: An Overview. arXiv."},{"key":"ref_79","doi-asserted-by":"crossref","first-page":"171","DOI":"10.1023\/A:1010920819831","article-title":"A Simple Generalisation of the Area Under the ROC Curve for Multiple Class Classification Problems","volume":"45","author":"Hand","year":"2001","journal-title":"Mach. Learn."},{"key":"ref_80","unstructured":"Lillicrap, T.P., Hunt, J.J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., and Wierstra, D. (2016, January 2\u20134). Continuous Control with Deep Reinforcement Learning. Proceedings of the International Conference on Learning Representations (ICLR), San Juan, Puerto Rico."},{"key":"ref_81","unstructured":"Haarnoja, T., Zhou, A., Abbeel, P., and Levine, S. (2018, January 10\u201315). Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor. Proceedings of the 35th International Conference on Machine Learning (ICML), Stockholm, Sweden."},{"key":"ref_82","unstructured":"Ng, A.Y., and Russell, S.J. (July, January 29). Algorithms for Inverse Reinforcement Learning. 
Proceedings of the Seventeenth International Conference on Machine Learning (ICML), Stanford, CA, USA."},{"key":"ref_83","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1613\/jair.3987","article-title":"A Survey of Multi-Objective Sequential Decision-Making","volume":"48","author":"Roijers","year":"2013","journal-title":"J. Artif. Intell. Res."}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/18\/11\/706\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,11,5]],"date-time":"2025-11-05T17:12:28Z","timestamp":1762362748000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/18\/11\/706"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,11,5]]},"references-count":83,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2025,11]]}},"alternative-id":["a18110706"],"URL":"https:\/\/doi.org\/10.3390\/a18110706","relation":{},"ISSN":["1999-4893"],"issn-type":[{"value":"1999-4893","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,11,5]]}}}