{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,10]],"date-time":"2026-04-10T09:59:15Z","timestamp":1775815155710,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":37,"publisher":"ACM","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2025,9,22]]},"DOI":"10.1145\/3705328.3748163","type":"proceedings-article","created":{"date-parts":[[2025,9,6]],"date-time":"2025-09-06T10:48:44Z","timestamp":1757155724000},"page":"894-901","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["Yambda-5B \u2014 A Large-Scale Multi-Modal Dataset for Ranking and Retrieval"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0009-0008-4535-4204","authenticated-orcid":false,"given":"Alexander","family":"Ploshkin","sequence":"first","affiliation":[{"name":"Yandex, Moscow, Russian Federation"}]},{"ORCID":"https:\/\/orcid.org\/0009-0003-3960-8689","authenticated-orcid":false,"given":"Vladislav","family":"Tytskiy","sequence":"additional","affiliation":[{"name":"Yandex, Moscow, Russian Federation"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-0451-5260","authenticated-orcid":false,"given":"Alexey","family":"Pismenny","sequence":"additional","affiliation":[{"name":"Yandex, Moscow, Russian Federation"}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-4864-2305","authenticated-orcid":false,"given":"Vladimir","family":"Baikalov","sequence":"additional","affiliation":[{"name":"Yandex, Moscow, Russian Federation"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-2128-0701","authenticated-orcid":false,"given":"Evgeny","family":"Taychinov","sequence":"additional","affiliation":[{"name":"Yandex, Moscow, Russian Federation"}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-5346-243X","authenticated-orcid":false,"given":"Artem","family":"Permiakov","sequence":"additional","affiliation":[{"name":"Yandex, Moscow, Russian Federation"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-7957-1712","authenticated-orcid":false,"given":"Daniil","family":"Burlakov","sequence":"additional","affiliation":[{"name":"AIM HIGH TECHNOLOGY, Almaty, Kazakhstan"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-2627-4056","authenticated-orcid":false,"given":"Eugene","family":"Krofto","sequence":"additional","affiliation":[{"name":"Yandex, Moscow, Russian Federation"}]}],"member":"320","published-online":{"date-parts":[[2025,9,7]]},"reference":[{"key":"e_1_3_3_2_2_2","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330701"},{"key":"e_1_3_3_2_3_2","unstructured":"Newsha Ardalani Carole-Jean Wu Zeliang Chen Bhargav Bhushanam and Adnan Aziz. 2022. Understanding scaling laws for recommendation models. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2208.08489 (2022)."},{"key":"e_1_3_3_2_4_2","unstructured":"James Bennett and Stan Lanning. 2007. The netflix prize. (2007)."},{"key":"e_1_3_3_2_5_2","unstructured":"Tom Brown Benjamin Mann Nick Ryder Melanie Subbiah Jared\u00a0D Kaplan Prafulla Dhariwal Arvind Neelakantan Pranav Shyam Girish Sastry Amanda Askell et\u00a0al. 2020. Language models are few-shot learners. Advances in neural information processing systems 33 (2020) 1877\u20131901."},{"key":"e_1_3_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.1145\/3240323.3240342"},{"key":"e_1_3_3_2_7_2","unstructured":"Junyoung Chung Caglar Gulcehre KyungHyun Cho and Yoshua Bengio. 2014. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/1412.3555 (2014)."},{"key":"e_1_3_3_2_8_2","unstructured":"Alexey Dosovitskiy Lucas Beyer Alexander Kolesnikov Dirk Weissenborn Xiaohua Zhai Thomas Unterthiner Mostafa Dehghani Matthias Minderer Georg Heigold Sylvain Gelly et\u00a0al. 2020. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2010.11929 (2020)."},{"key":"e_1_3_3_2_9_2","unstructured":"GroupLens. [n. d.]. MovieLens. https:\/\/grouplens.org\/datasets\/movielens\/. Accessed: (2025-05-28)."},{"key":"e_1_3_3_2_10_2","doi-asserted-by":"crossref","unstructured":"F\u00a0Maxwell Harper and Joseph\u00a0A Konstan. 2015. The movielens datasets: History and context. Acm transactions on interactive intelligent systems (tiis) 5 4 (2015) 1\u201319.","DOI":"10.1145\/2827872"},{"key":"e_1_3_3_2_11_2","unstructured":"Bal\u00e1zs Hidasi Alexandros Karatzoglou Linas Baltrunas and Domonkos Tikk. 2015. Session-based recommendations with recurrent neural networks. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/1511.06939 (2015)."},{"key":"e_1_3_3_2_12_2","doi-asserted-by":"crossref","unstructured":"Sepp Hochreiter and J\u00fcrgen Schmidhuber. 1997. Long short-term memory. Neural computation 9 8 (1997) 1735\u20131780.","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"e_1_3_3_2_13_2","unstructured":"Jordan Hoffmann Sebastian Borgeaud Arthur Mensch Elena Buchatskaya Trevor Cai Eliza Rutherford Diego de\u00a0Las Casas Lisa\u00a0Anne Hendricks Johannes Welbl Aidan Clark et\u00a0al. 2022. Training compute-optimal large language models. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2203.15556 (2022)."},{"key":"e_1_3_3_2_14_2","unstructured":"Yupeng Hou Jiacheng Li Zhankui He An Yan Xiusi Chen and Julian McAuley. 2024. Bridging Language and Items for Retrieval and Recommendation. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2403.03952 (2024)."},{"key":"e_1_3_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2008.22"},{"key":"e_1_3_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1145\/2505515.2505665"},{"key":"e_1_3_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.1145\/3397271.3401233"},{"key":"e_1_3_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2018.00035"},{"key":"e_1_3_3_2_19_2","unstructured":"Jared Kaplan Sam McCandlish Tom Henighan Tom\u00a0B Brown Benjamin Chess Rewon Child Scott Gray Alec Radford Jeffrey Wu and Dario Amodei. 2020. Scaling laws for neural language models. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2001.08361 (2020)."},{"key":"e_1_3_3_2_20_2","unstructured":"Criteo\u00a0AI Lab. [n. d.]. Criteo 1TB Click Logs Dataset. https:\/\/ailab.criteo.com\/download-criteo-1tb-click-logs-dataset. Accessed: (2025-05-12)."},{"key":"e_1_3_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1145\/2507157.2507163"},{"key":"e_1_3_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.1145\/2766462.2767755"},{"key":"e_1_3_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1145\/3511808.3557656"},{"key":"e_1_3_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1018"},{"key":"e_1_3_3_2_25_2","unstructured":"Shashank Rajput Nikhil Mehta Anima Singh Raghunandan Hulikal\u00a0Keshavan Trung Vu Lukasz Heldt Lichan Hong Yi Tay Vinh Tran Jonah Samost et\u00a0al. 2023. Recommender systems with generative retrieval. Advances in Neural Information Processing Systems 36 (2023) 10299\u201310315."},{"key":"e_1_3_3_2_26_2","unstructured":"Steffen Rendle Christoph Freudenthaler Zeno Gantner and Lars Schmidt-Thieme. 2012. BPR: Bayesian personalized ranking from implicit feedback. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/1205.2618 (2012)."},{"key":"e_1_3_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1145\/371920.372071"},{"key":"e_1_3_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.1145\/2911996.2912004"},{"key":"e_1_3_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.1145\/3498366.3505791"},{"key":"e_1_3_3_2_30_2","unstructured":"Janne Spijkervet and John\u00a0Ashley Burgoyne. 2021. Contrastive learning of musical representations. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2103.09410 (2021)."},{"key":"e_1_3_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1145\/3604915.3608827"},{"key":"e_1_3_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1145\/3308558.3313710"},{"key":"e_1_3_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.1145\/3357384.3357895"},{"key":"e_1_3_3_2_34_2","first-page":"96","volume-title":"ISMIR","author":"Vigliensoni Gabriel","year":"2017","unstructured":"Gabriel Vigliensoni and Ichiro Fujinaga. 2017. The music listening histories dataset.. In ISMIR. 96\u2013102."},{"key":"e_1_3_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.1145\/3124749.3124754"},{"key":"e_1_3_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.1145\/3442381.3450078"},{"key":"e_1_3_3_2_37_2","unstructured":"Buyun Zhang Liang Luo Yuxin Chen Jade Nie Xi Liu Daifeng Guo Yanli Zhao Shen Li Yuchen Hao Yantao Yao et\u00a0al. 2024. Wukong: Towards a scaling law for large-scale recommendation. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2403.02545 (2024)."},{"key":"e_1_3_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.1145\/3640457.3688129"}],"event":{"name":"RecSys '25: Nineteenth ACM Conference on Recommender Systems","location":"Prague Czech Republic","acronym":"RecSys '25","sponsor":["SIGCHI ACM Special Interest Group on Computer-Human Interaction","SIGAI ACM Special Interest Group on Artificial Intelligence","SIGIR ACM Special Interest Group on Information Retrieval","SIGKDD ACM Special Interest Group on Knowledge Discovery in Data","SIGWEB ACM Special Interest Group on Hypertext, Hypermedia, and Web"]},"container-title":["Proceedings of the Nineteenth ACM Conference on Recommender Systems"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3705328.3748163","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,6]],"date-time":"2025-09-06T11:46:32Z","timestamp":1757159192000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3705328.3748163"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,9,7]]},"references-count":37,"alternative-id":["10.1145\/3705328.3748163","10.1145\/3705328"],"URL":"https:\/\/doi.org\/10.1145\/3705328.3748163","relation":{},"subject":[],"published":{"date-parts":[[2025,9,7]]},"assertion":[{"value":"2025-09-07","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}