{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,15]],"date-time":"2026-06-15T14:15:39Z","timestamp":1781532939783,"version":"3.54.5"},"reference-count":62,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T00:00:00Z","timestamp":1776038400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100012166","name":"National Key R&D Program of China","doi-asserted-by":"publisher","award":["2022YFF0707003"],"award-info":[{"award-number":["2022YFF0707003"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"publisher"}]},{"name":"China University Industry-Academia-Research Innovation Funds","award":["2023RY020"],"award-info":[{"award-number":["2023RY020"]}]},{"name":"Gansu Province Science and Technology Major Project\u2014Industrial Project","award":["23ZDGA006"],"award-info":[{"award-number":["23ZDGA006"]}]},{"name":"Supercomputing Center of Lanzhou University"},{"id":[{"id":"https:\/\/ror.org\/01mkqqe32","id-type":"ROR","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>Optimizing long-term user satisfaction in sequential recommender systems is a critical challenge. Offline reinforcement learning (RL) offers a promising solution by learning recommendation policies from historical interaction logs without incurring the high costs of online exploration. However, offline RL suffers from severe distribution shift: the learned policy often overestimates the value of out-of-distribution (OOD) items, leading to unreliable recommendations and compromising user satisfaction. To address this issue, we propose a novel framework known as the Q-Learning Regularized Decision Transformer (QRDT). Built upon the Decision Transformer architecture, QRDT models recommendations as a sequence prediction task to capture complex user interest dynamics. To mitigate distribution shift, the QRDT integrates Kullback\u2013Leibler (KL) divergence and maximum entropy regularization into the Q-value function, enabling conservative long-term value estimation while encouraging diverse exploration within the logged data distribution. Extensive experiments on four real-world Amazon e-commerce datasets (CDs, Clothing, Cellphones, and Beauty) demonstrate that the QRDT achieves competitive performance and outperforms the PGPR baseline in most scenarios. Specifically, the proposed method yields improvements of 2.99% in Hit Rate (HR), 2.19% in Normalized Discounted Cumulative Gain (NDCG), 0.94% in Recall, and 0.84% in Precision, verifying the effectiveness of our regularization approach.<\/jats:p>","DOI":"10.3390\/info17040364","type":"journal-article","created":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T08:49:02Z","timestamp":1776070142000},"page":"364","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Mitigating Distribution Shift in Offline RL-Based Recommender Systems with a Q-Learning Regularization Decision Transformer"],"prefix":"10.3390","volume":"17","author":[{"ORCID":"https:\/\/orcid.org\/0009-0001-5685-8044","authenticated-orcid":false,"given":"Yu","family":"Zhou","sequence":"first","affiliation":[{"name":"School of Information Science & Engineering, Lanzhou University, Lanzhou 730000, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Xinyu","family":"Guo","sequence":"additional","affiliation":[{"name":"College of Artificial Intelligence, Nankai University, Tianjin 300350, China"},{"name":"North Automatic Control Technology Institute, Taiyuan 030006, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-4444-3910","authenticated-orcid":false,"given":"Yuanbo","family":"Jiang","sequence":"additional","affiliation":[{"name":"School of Information Science & Engineering, Lanzhou University, Lanzhou 730000, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-1376-0188","authenticated-orcid":false,"given":"Jiaxuan","family":"Fang","sequence":"additional","affiliation":[{"name":"School of Information Science & Engineering, Lanzhou University, Lanzhou 730000, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3500-2597","authenticated-orcid":false,"given":"Jin-Qiang","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Information Science & Engineering, Lanzhou University, Lanzhou 730000, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3318-9247","authenticated-orcid":false,"given":"Peng","family":"Zhi","sequence":"additional","affiliation":[{"name":"School of Information Science & Engineering, Lanzhou University, Lanzhou 730000, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7139-0462","authenticated-orcid":false,"given":"Gang","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Information Science & Engineering, Lanzhou University, Lanzhou 730000, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9968-6190","authenticated-orcid":false,"given":"Rui","family":"Zhou","sequence":"additional","affiliation":[{"name":"School of Information Science & Engineering, Lanzhou University, Lanzhou 730000, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-8169-4360","authenticated-orcid":false,"given":"Ling-Huey","family":"Li","sequence":"additional","affiliation":[{"name":"Department of Computer Science\/DI, University of Salerno, 84084 Fisciano, SA, Italy"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8237-8979","authenticated-orcid":false,"given":"Chuanyi","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Information Science & Engineering, Lanzhou University, Lanzhou 730000, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8054-5446","authenticated-orcid":false,"given":"Qingguo","family":"Zhou","sequence":"additional","affiliation":[{"name":"School of Information Science & Engineering, Lanzhou University, Lanzhou 730000, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1381-4364","authenticated-orcid":false,"given":"Kuan-Ching","family":"Li","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Information Engineering, Providence University, Taichung 43301, Taiwan"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2026,4,13]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Jannach, D., Zanker, M., Felfernig, A., and Friedrich, G. (2010). Recommender Systems: An Introduction, Cambridge University Press.","DOI":"10.1017\/CBO9780511763113"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Ricci, F., Rokach, L., and Shapira, B. (2011). Introduction to Recommender Systems Handbook. Recommender Systems Handbook, Springer.","DOI":"10.1007\/978-0-387-85820-3"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"47","DOI":"10.26599\/BDMA.2020.9020015","article-title":"Hybrid recommender system for tourism based on big data and AI: A conceptual framework","volume":"4","author":"Nafis","year":"2021","journal-title":"Big Data Min. Anal."},{"key":"ref_4","unstructured":"(2026, January 01). Netflix Update: Try This at Home. Available online: https:\/\/sifter.org\/~simon\/journal\/20061211.html."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"103847","DOI":"10.1016\/j.csi.2024.103847","article-title":"Hybrid collaborative filtering using matrix factorization and XGBoost for movie recommendation","volume":"90","author":"Behera","year":"2024","journal-title":"Comput. Stand. Interfaces"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Wang, Y., Feng, D., Li, D., Chen, X., Zhao, Y., and Niu, X. (2016, January 24\u201329). A mobile recommendation system based on logistic regression and gradient boosting decision trees. Proceedings of the International Joint Conference on Neural Networks, Vancouver, BC, Canada.","DOI":"10.1109\/IJCNN.2016.7727431"},{"key":"ref_7","unstructured":"Zhang, Y. (2022, January 25\u201329). An introduction to matrix factorization and factorization machines in recommendation system, and beyond. Proceedings of the 10th International Conference on Learning Representations, Virtual."},{"key":"ref_8","unstructured":"Hong, F., Huang, D., and Chen, G. (February, January 27). Interaction-aware factorization machines for recommender systems. Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Sun, Y., Pan, J., Zhang, A., and Flores, A. (2021, January 19\u201323). FM2: Field-matrixed factorization machines for recommender systems. Proceedings of the Web Conference 2021, Ljubljana, Slovenia.","DOI":"10.1145\/3442381.3449930"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"109","DOI":"10.1016\/j.knosys.2013.03.012","article-title":"Recommender systems survey","volume":"46","author":"Bobadilla","year":"2013","journal-title":"Knowl.-Based Syst."},{"key":"ref_11","first-page":"1","article-title":"On the opportunities and challenges of offline reinforcement learning for recommender systems","volume":"42","author":"Chen","year":"2024","journal-title":"ACM Trans. Inf. Syst."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1227","DOI":"10.1109\/TSUSC.2025.3614896","article-title":"FedTCTF: Tensor Completion-Based Federated Learning for Device Heterogeneity","volume":"10","author":"Liang","year":"2025","journal-title":"IEEE Trans. Sustain. Comput."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Li, Y., Liang, W., Xie, K., Zhang, D., Xie, S., and Li, K.C. (2023, January 17\u201320). LightNestle: Quick and Accurate Neural Sequential Tensor Completion via Meta Learning. Proceedings of the IEEE International Conference on Computer Communications 2023, New York, NY, USA.","DOI":"10.1109\/INFOCOM53939.2023.10228967"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Zhou, S., Chen, Y., Li, K.C., Liang, W., and Meng, W. (2026). Trusthfl: An efficient aggregation method for trustworthy hierarchical federated learning. IEEE Internet Things J., 1\u201314.","DOI":"10.1109\/JIOT.2026.3673086"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"36646","DOI":"10.1109\/JIOT.2024.3420096","article-title":"Collaborative Federated Learning in Mobile Vehicle Clouds for Online Ride-Hailing Passenger Zones Recommendation","volume":"11","author":"Liao","year":"2024","journal-title":"IEEE Internet Things J."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Zheng, G., Zhang, F., Zheng, Z., Xiang, Y., Yuan, N.J., Xie, X., and Li, Z. (2018, January 23\u201327). DRN: A deep reinforcement learning framework for news recommendation. Proceedings of the 2018 World Wide Web Conference, Lyon, France.","DOI":"10.1145\/3178876.3185994"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Zhao, X., Xia, L., Zhang, L., Ding, Z., Yin, D., and Tang, J. (2018, January 2\u20137). Deep reinforcement learning for page-wise recommendations. Proceedings of the ACM Conference on Recommender Systems, Vancouver, BC, Canada.","DOI":"10.1145\/3240323.3240374"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"25648","DOI":"10.1109\/JIOT.2024.3379363","article-title":"TrustBCFL: Mitigating data bias in IoT through blockchain-enabled federated learning","volume":"11","author":"Zhou","year":"2024","journal-title":"IEEE Internet Things J."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"13164","DOI":"10.1109\/TNNLS.2023.3280161","article-title":"A survey on reinforcement learning for recommender systems","volume":"35","author":"Lin","year":"2023","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_20","unstructured":"Deffayet, R., Thonet, T., Renders, J.-M., and de Rijke, M. (2023, January 23\u201327). Offline evaluation for reinforcement learning-based recommendation: A critical issue and some alternatives. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, Taipei, Taiwan."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Jeunen, O. (2019, January 16\u201320). Revisiting offline evaluation for implicit-feedback recommender systems. Proceedings of the 13th ACM Conference on Recommender Systems, Copenhagen, Denmark.","DOI":"10.1145\/3298689.3347069"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1145\/3569930","article-title":"A critical study on data leakage in recommender system offline evaluation","volume":"41","author":"Ji","year":"2023","journal-title":"ACM Trans. Inf. Syst."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Yang, Z., He, X., Zhang, J., Wu, J., Xin, X., Chen, J., and Wang, X. (2023, January 23\u201327). A generic learning framework for sequential recommendation with distribution shifts. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, Taipei, Taiwan.","DOI":"10.1145\/3539618.3591624"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Xiao, T., Chen, Z., and Wang, S. (2023, January 6\u201310). Reconsidering learning objectives in unbiased recommendation: A distribution shift perspective. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, Long Beach, CA, USA.","DOI":"10.1145\/3580305.3599487"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Liao, Y., Yang, Y., Hou, M., Wu, L., Xu, H., and Liu, H. (2025, January 13\u201318). Mitigating distribution shifts in sequential recommendation: An invariance perspective. Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval, Padua, Italy.","DOI":"10.1145\/3726302.3730036"},{"key":"ref_26","unstructured":"Fujimoto, S., Meger, D., and Precup, D. (2019, January 9\u201315). Off-Policy Deep Reinforcement Learning without Exploration. Proceedings of the 36th International Conference on Machine Learning, Long Beach, CA, USA."},{"key":"ref_27","unstructured":"Kumar, A., Zhou, A., Tucker, G., Wang, J., and Levine, S. (2020, January 6\u201312). Conservative Q-learning for offline reinforcement learning. Proceedings of the 34th International Conference on Neural Information Processing System, Vancouver, BC, Canada."},{"key":"ref_28","unstructured":"Fujimoto, S., and Gu, S.S. (2021, January 6\u201314). A minimalist approach to offline reinforcement learning. Proceedings of the 35th International Conference on Neural Information Processing Systems, Los Angeles, CA, USA."},{"key":"ref_29","first-page":"15084","article-title":"Decision transformer: Reinforcement learning via sequence modeling","volume":"34","author":"Chen","year":"2021","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Chen, X., Xu, H., Zhang, Y., Tang, J., Cao, Y., Qin, Z., and Zha, H. (2018, January 5\u20139). Sequential recommendation with user memory networks. Proceedings of the 11th ACM International Conference on Web Search and Data Mining, Marina Del Rey, CA, USA.","DOI":"10.1145\/3159652.3159668"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"075201","DOI":"10.1088\/1402-4896\/ae40a7","article-title":"An asymmetric graph neural network with time-position encoding for link prediction in dynamic networks","volume":"101","author":"Han","year":"2026","journal-title":"Phys. Scr."},{"key":"ref_32","unstructured":"Hidasi, B., Karatzoglou, A., Baltrunas, L., and Tikk, D. (2016, January 2\u20134). Session-based recommendations with recurrent neural networks. Proceedings of the 4th International Conference on Learning Representations, San Juan, PR, USA."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Wu, C.Y., Ahmed, A., Beutel, A., Smola, A.J., and Jing, H. (2017, January 6\u201310). Recurrent recommender networks. Proceedings of the 10th ACM International Conference on Web Search and Data Mining, Cambridge, UK.","DOI":"10.1145\/3018661.3018689"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Kang, W.-C., and McAuley, J. (2018, January 17\u201320). Self-attentive sequential recommendation. Proceedings of the 2018 IEEE International Conference on Data Mining, Singapore.","DOI":"10.1109\/ICDM.2018.00035"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Sun, F., Liu, J., Wu, J., Pei, C., Lin, X., Ou, W., and Jiang, P. (2019, January 3\u20137). BERT4Rec: Sequential recommendation with bidirectional encoder representations from transformer. Proceedings of the 28th ACM International Conference on Information and Knowledge Management, Beijing, China.","DOI":"10.1145\/3357384.3357895"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Zou, L., Xia, L., Ding, Z., Song, J., Liu, W., and Yin, D. (2019, January 4\u20138). Reinforcement learning to optimize long-term user engagement in recommender systems. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA.","DOI":"10.1145\/3292500.3330668"},{"key":"ref_37","unstructured":"Xue, W., Cai, Q., Zhan, R., Zheng, D., Jiang, P., and An, B. (2023, January 1\u20135). ResAct: Reinforcing long-term engagement in sequential recommendation with residual actor. Proceedings of the 11th International Conference on Learning Representations, Kigali, Rwanda."},{"key":"ref_38","first-page":"38","article-title":"Robust federated learning with contrastive learning and meta-learning","volume":"9","author":"Zhang","year":"2025","journal-title":"Int. J. Interact. Multimed. Artif. Intell."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Zhao, X., Zhang, L., Ding, Z., Xia, L., Tang, J., and Yin, D. (2018, January 19\u201323). Recommendations with negative feedback via pairwise deep reinforcement learning. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK.","DOI":"10.1145\/3219819.3219886"},{"key":"ref_40","unstructured":"Chen, M., Wang, Y., Xu, C., Le, Y., Sharma, M., Richardson, L., Wu, S.-L., and Chi, E. (October, January 27). Values of user exploration in recommender systems. Proceedings of the 15th ACM Conference on Recommender Systems, Amsterdam, The Netherlands."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Cai, Q., Liu, S., Wang, X., Zuo, T., Xie, W., Yang, B., Zheng, D., Jiang, P., and Gai, K. (May, January 30). Reinforcing User Retention in a Billion Scale Short Video Recommender System. Proceedings of the ACM Web Conference 2023, Austin, TX, USA.","DOI":"10.1145\/3543873.3584640"},{"key":"ref_42","unstructured":"Levine, S., Kumar, A., Tucker, G., and Fu, J. (2020, January 26\u201330). Offline reinforcement learning: Tutorial, review, and perspectives on open problems. Proceedings of the 8th International Conference on Learning Representations, Addis Ababa, Ethiopia."},{"key":"ref_43","first-page":"1","article-title":"On the theory of policy gradient methods: Optimality, approximation, and distribution shift","volume":"22","author":"Agarwal","year":"2021","journal-title":"J. Mach. Learn. Res."},{"key":"ref_44","first-page":"1926","article-title":"A survey of offline policy evaluation in reinforcement learning","volume":"45","author":"Wang","year":"2022","journal-title":"Chin. J. Comput."},{"key":"ref_45","unstructured":"Wu, Y., Tucker, G., and Nachum, O. (2019, January 6\u20139). Behavior Regularized Offline Reinforcement Learning. Proceedings of the 7th International Conference on Learning Representations, New Orleans, LA, USA."},{"key":"ref_46","first-page":"7768","article-title":"Critic regularized regression","volume":"33","author":"Wang","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_47","unstructured":"Siegel, N.Y., Rajeswaran, A., Kumar, A., and Levine, S. (2020, January 26\u201330). Keep doing what worked: Behavioral modelling priors for offline reinforcement learning. Proceedings of the 8th International Conference on Learning Representations, Addis Ababa, Ethiopia."},{"key":"ref_48","unstructured":"Xu, H., Chen, L., Zhang, F., Yu, T., and Levine, S. (December, January 28). A policy-guided imitation approach for offline reinforcement learning. Proceedings of the 36th International Conference on Neural Information Processing System, New Orleans, LA, USA."},{"key":"ref_49","first-page":"14129","article-title":"MOPO: Model-based offline policy optimization","volume":"33","author":"Yu","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_50","first-page":"21810","article-title":"MOREL: Model-based offline reinforcement learning","volume":"33","author":"Kidambi","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_51","first-page":"28954","article-title":"COMBO: Conservative offline model-based policy optimization","volume":"34","author":"Yu","year":"2021","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_52","unstructured":"Zhou, W., Bajracharya, S., and Held, D. (2020, January 16\u201318). PLAS: Latent action space for offline reinforcement learning. Proceedings of the 4th Conference on Robot Learning, Cambridge, MA, USA."},{"key":"ref_53","unstructured":"Chen, X., Ghadirzadeh, A., Yu, T., Wang, J., Gao, Y., Li, W., Liang, B., Finn, C., and Zhang, C. (December, January 28). LAPO: Latent-variable advantage-weighted policy optimization for offline reinforcement learning. Proceedings of the 36th International Conference on Neural Information Processing System, New Orleans, LA, USA."},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"346","DOI":"10.1093\/comjnl\/bxaf119","article-title":"ECaps-GTR: Optimizing spatiotemporal EEG emotion recognition via the augmented capsule-gated transformer","volume":"69","author":"Wang","year":"2026","journal-title":"Comput. J."},{"key":"ref_55","unstructured":"Rendle, S., Freudenthaler, C., Gantner, Z., and Schmidt-Thieme, L. (2009, January 18\u201321). BPR: Bayesian Personalized Ranking from Implicit Feedback. Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence, Montreal, QC, Canada."},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"He, R., and McAuley, J. (2016, January 12\u201317). VBPR: Visual Bayesian Personalized Ranking from Implicit Feedback. Proceedings of the AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA.","DOI":"10.1609\/aaai.v30i1.9973"},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Zhang, F., Yuan, N.J., Lian, D., Xie, X., and Ma, W. (2016, January 13\u201317). Collaborative Knowledge Base Embedding for Recommender Systems. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, San Francisco, CA, USA.","DOI":"10.1145\/2939672.2939673"},{"key":"ref_58","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Ai, Q., Chen, X., Chen, Z., Liu, W., and Tang, J. (2017, January 6\u201310). Joint Representation Learning for Top-N Recommendation with Heterogeneous Information Sources. Proceedings of the Conference on Information and Knowledge Management, Singapore.","DOI":"10.1145\/3132847.3132892"},{"key":"ref_59","doi-asserted-by":"crossref","unstructured":"Xian, Y., Fu, Z., Muthukrishnan, S., de Melo, G., and Zhang, Y. (2019, January 21\u201325). Reinforcement Knowledge Graph Reasoning for Explainable Recommendation. Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, Paris, France.","DOI":"10.1145\/3331184.3331203"},{"key":"ref_60","unstructured":"Schnabel, T., Swaminathan, A., Singh, A., Chandak, N., and Joachims, T. (2016, January 19\u201324). Recommendations as treatments: Debiasing learning and evaluation. Proceedings of the International Conference on Machine Learning, New York, NY, USA."},{"key":"ref_61","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1145\/3564284","article-title":"Bias and debias in recommender system: A survey and future directions","volume":"41","author":"Chen","year":"2023","journal-title":"ACM Trans. Inf. Syst."},{"key":"ref_62","first-page":"3207","article-title":"Counterfactual reasoning and learning systems: The example of computational advertising","volume":"14","author":"Bottou","year":"2013","journal-title":"J. Mach. Learn. Res."}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/17\/4\/364\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T09:34:31Z","timestamp":1776072871000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/17\/4\/364"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,4,13]]},"references-count":62,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2026,4]]}},"alternative-id":["info17040364"],"URL":"https:\/\/doi.org\/10.3390\/info17040364","relation":{},"ISSN":["2078-2489"],"issn-type":[{"value":"2078-2489","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,4,13]]}}}