{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T01:43:26Z","timestamp":1760060606485,"version":"build-2065373602"},"reference-count":21,"publisher":"MDPI AG","issue":"9","license":[{"start":{"date-parts":[[2025,9,3]],"date-time":"2025-09-03T00:00:00Z","timestamp":1756857600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Computation"],"abstract":"<jats:p>Clustering techniques significantly enhance recommender systems by improving predictive accuracy and interpretability, particularly in sparse, high-dimensional datasets. This research presents a comprehensive comparative analysis of traditional clustering methods such as K-means and Fuzzy C-Means (FCM) against advanced probabilistic clustering methodologies based on Non-negative Matrix Factorization (NMF), focusing specifically on Bayesian NMF. Experiments conducted using the widely recognized MovieLens 1M dataset reveal Bayesian NMF\u2019s superior performance in terms of predictive accuracy, intra-cluster cohesion, and interpretability compared to classical methods. The study systematically evaluates the influence of key parameters such as overlap (\u03b1) and evidence threshold (\u03b2) in Bayesian NMF, demonstrating that careful parameter tuning substantially improves recommendation quality. The results highlight the inherent trade-off between cluster cohesion and predictive accuracy, emphasizing the flexibility and robustness of probabilistic approaches in accurately modeling user preferences and behaviors. The paper concludes by proposing future directions, including the exploration of hybrid clustering methods, dynamic adaptation to evolving user preferences, and integration of contextual information, thereby fostering continued advances in personalized-recommendation research.<\/jats:p>","DOI":"10.3390\/computation13090213","type":"journal-article","created":{"date-parts":[[2025,9,3]],"date-time":"2025-09-03T15:15:57Z","timestamp":1756912557000},"page":"213","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Matrix Factorization-Based Clustering for Sparse Data in Recommender Systems: A Comparative Study"],"prefix":"10.3390","volume":"13","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6045-8692","authenticated-orcid":false,"given":"Rodolfo","family":"Bojorque","sequence":"first","affiliation":[{"name":"Campus El Vecino, Universidad Polit\u00e9cnica Salesiana, Cuenca 010102, Ecuador"},{"name":"Math Innovation Group, Universidad Polit\u00e9cnica Salesiana, Cuenca 010102, Ecuador"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7472-9417","authenticated-orcid":false,"given":"Remigio","family":"Hurtado","sequence":"additional","affiliation":[{"name":"Campus El Vecino, Universidad Polit\u00e9cnica Salesiana, Cuenca 010102, Ecuador"}]}],"member":"1968","published-online":{"date-parts":[[2025,9,3]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"651","DOI":"10.1016\/j.patrec.2009.09.011","article-title":"Data clustering: 50 years beyond K-means","volume":"31","author":"Jain","year":"2010","journal-title":"Pattern Recognit. Lett."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"30","DOI":"10.1109\/MC.2009.263","article-title":"Matrix factorization techniques for recommender systems","volume":"42","author":"Koren","year":"2009","journal-title":"Computer"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"191","DOI":"10.1016\/0098-3004(84)90020-7","article-title":"FCM: The fuzzy c-means clustering algorithm","volume":"10","author":"Bezdek","year":"1984","journal-title":"Comput. Geosci."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"188","DOI":"10.1016\/j.knosys.2015.12.018","article-title":"A non negative matrix factorization for collaborative filtering recommender systems based on a Bayesian probabilistic model","volume":"97","author":"Hernando","year":"2016","journal-title":"Knowl.-Based Syst."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"3549","DOI":"10.1109\/ACCESS.2017.2788138","article-title":"Recommender Systems Clustering Using Bayesian Non Negative Matrix Factorization","volume":"6","author":"Bobadilla","year":"2018","journal-title":"IEEE Access"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"156","DOI":"10.1016\/j.ins.2015.03.062","article-title":"Novel centroid selection approaches for KMeans-clustering based recommender systems","volume":"320","author":"Zahra","year":"2015","journal-title":"Inf. Sci."},{"key":"ref_7","first-page":"281","article-title":"Some methods for classification and analysis of multivariate observations","volume":"Volume 1","author":"MacQueen","year":"1967","journal-title":"Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability"},{"key":"ref_8","unstructured":"Arthur, D., and Vassilvitskii, S. (2007, January 7\u20139). K-means++: The Advantages of Careful Seeding. Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, New Orleans, LA, USA. Society for Industrial and Applied Mathematics."},{"key":"ref_9","unstructured":"Bezdek, J.C. (2013). Pattern Recognition with Fuzzy Objective Function Algorithms, Springer Science & Business Media."},{"key":"ref_10","unstructured":"Salakhutdinov, R., and Mnih, A. (2007, January 3\u20136). Probabilistic matrix factorization. Proceedings of the Advances in Neural Information Processing Systems (NeurIPS), Vancouver, BC, Canada."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Hofmann, T. (1999, January 15\u201319). Probabilistic latent semantic indexing. Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Berkeley, CA, USA.","DOI":"10.1145\/312624.312649"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"89","DOI":"10.1145\/963770.963774","article-title":"Latent semantic models for collaborative filtering","volume":"22","author":"Hofmann","year":"2004","journal-title":"ACM Trans. Inf. Syst. (TOIS)"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"391","DOI":"10.1002\/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9","article-title":"Indexing by latent semantic analysis","volume":"41","author":"Deerwester","year":"1990","journal-title":"J. Am. Soc. Inf. Sci."},{"key":"ref_14","first-page":"993","article-title":"Latent Dirichlet Allocation","volume":"3","author":"Blei","year":"2003","journal-title":"J. Mach. Learn. Res."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"788","DOI":"10.1038\/44565","article-title":"Learning the parts of objects by non-negative matrix factorization","volume":"401","author":"Lee","year":"1999","journal-title":"Nature"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Bojorque, R., Arcos, M., Plaza, A., and Morquecho, P. (2024, January 9\u201311). Clustering Sparse Matrices Using Dimensional Reduction in Recommender Systems. Proceedings of the Management, Tourism and Smart Technologies (ICMTT 2024), Cusco, Peru. 1190 LNNS.","DOI":"10.1007\/978-3-031-74825-7_11"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"224","DOI":"10.1109\/TPAMI.1979.4766909","article-title":"A cluster separation measure","volume":"PAMI-1","author":"Davies","year":"1979","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Zhao, Q., Xu, M., and Fr\u00e4nti, P. (2009, January 23\u201325). Sum-of-squares based cluster validity index and significance analysis. Proceedings of the International Conference on Adaptive and Natural Computing Algorithms, Kuopio, Finland.","DOI":"10.1007\/978-3-642-04921-7_32"},{"key":"ref_19","unstructured":"Palma, F.D., Carri\u00e8re, B., and Varoquaux, G. (2025). Do LLMs Memorize Recommendation Datasets? A Case Study on MovieLens-1M. arXiv."},{"key":"ref_20","unstructured":"Gao, T., Yu, S., Wang, F., Chen, B., Shan, S., and Chen, X. (2025). Matrix Factorization with Dynamic Multi-view Clustering for Recommender System. arXiv."},{"key":"ref_21","unstructured":"Orme, D., Hao, Z., Liatsis, P., Jin, Y., and Yang, L. (2025). Multi-view Biclustering via Non-negative Matrix Tri-factorisation. arXiv."}],"container-title":["Computation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2079-3197\/13\/9\/213\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T18:38:55Z","timestamp":1760035135000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2079-3197\/13\/9\/213"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,9,3]]},"references-count":21,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2025,9]]}},"alternative-id":["computation13090213"],"URL":"https:\/\/doi.org\/10.3390\/computation13090213","relation":{},"ISSN":["2079-3197"],"issn-type":[{"type":"electronic","value":"2079-3197"}],"subject":[],"published":{"date-parts":[[2025,9,3]]}}}