{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,9,6]],"date-time":"2024-09-06T17:47:33Z","timestamp":1725644853953},"reference-count":36,"publisher":"Association for Computing Machinery (ACM)","issue":"4","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2022,12]]},"abstract":"Graph Convolutional Networks (GCN) can efficiently integrate graph structure and node features to learn high-quality node embeddings. At Pinterest, we have developed and deployed PinSage, a data-efficient GCN that learns pin embeddings from the Pin-Board graph. Pinterest relies heavily on PinSage which in turn only leverages the Pin-Board graph. However, there exist several entities at Pinterest and heterogeneous interactions among these entities. These diverse entities and interactions provide important signal for recommendations and modeling. In this work, we show that training deep learning models on graphs that captures these diverse interactions can result in learning higher-quality pin embeddings than training PinSage on only the Pin-Board graph. However, building a large-scale heterogeneous graph engine that can process the entire Pinterest size data has not yet been done. In this work, we present a clever and effective solution where we break the heterogeneous graph into multiple disjoint bipartite graphs and then develop novel data-efficient MultiBiSage model that combines the signals from them. MultiBiSage can capture the graph structure of multiple bipartite graphs to learn high-quality pin embeddings. The benefit of our approach is that individual bipartite graphs can be processed with minimal changes to Pinterest's current infrastructure, while being able to combine information from all the graphs while achieving high performance. 
We train MultiBiSage on six bipartite graphs including our Pin-Board graph and show that it significantly outperforms the deployed latest version of PinSage on multiple user engagement metrics. We also perform experiments on two public datasets to show that MultiBiSage is generalizable and can be applied to datasets outside of Pinterest.","DOI":"10.14778\/3574245.3574262","type":"journal-article","created":{"date-parts":[[2023,2,21]],"date-time":"2023-02-21T23:14:12Z","timestamp":1677021252000},"page":"781-789","update-policy":"http:\/\/dx.doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["MultiBiSage"],"prefix":"10.14778","volume":"16","author":[{"given":"Saket","family":"Gurukar","sequence":"first","affiliation":[{"name":"The Ohio State University"}]},{"given":"Nikil","family":"Pancha","sequence":"additional","affiliation":[{"name":"Pinterest"}]},{"given":"Andrew","family":"Zhai","sequence":"additional","affiliation":[{"name":"Pinterest"}]},{"given":"Eric","family":"Kim","sequence":"additional","affiliation":[{"name":"Pinterest"}]},{"given":"Samson","family":"Hu","sequence":"additional","affiliation":[{"name":"Pinterest"}]},{"given":"Srinivasan","family":"Parthasarathy","sequence":"additional","affiliation":[{"name":"The Ohio State University"}]},{"given":"Charles","family":"Rosenberg","sequence":"additional","affiliation":[{"name":"Pinterest"}]},{"given":"Jure","family":"Leskovec","sequence":"additional","affiliation":[{"name":"Stanford"}]}],"member":"320","published-online":{"date-parts":[[2023,2,21]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/WACV51458.2022.00150"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330964"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/2783258.2783296"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jalgor.2003.12.001"},{"key":"e_1_2_1_5_1","volume-title":"Graphzoom: A 
multi-level spectral approach for accurate and scalable graph embedding. arXiv preprint arXiv:1910.02370","author":"Deng Chenhui","year":"2019","unstructured":"Chenhui Deng , Zhiqiang Zhao , Yongyu Wang , Zhiru Zhang , and Zhuo Feng . 2019 . Graphzoom: A multi-level spectral approach for accurate and scalable graph embedding. arXiv preprint arXiv:1910.02370 (2019). Chenhui Deng, Zhiqiang Zhao, Yongyu Wang, Zhiru Zhang, and Zhuo Feng. 2019. Graphzoom: A multi-level spectral approach for accurate and scalable graph embedding. arXiv preprint arXiv:1910.02370 (2019)."},{"key":"e_1_2_1_6_1","volume-title":"TheWeb Conference.","author":"Eksombatchai Chantat","year":"2018","unstructured":"Chantat Eksombatchai , Pranav Jindal , Jerry Zitao Liu , Yuchen Liu , Rahul Sharma , Charles Sugnet , Mark Ulrich , and Jure Leskovec . 2018 . Pixie: A system for recommending 3+ billion items to 200+ million users in real-time . In TheWeb Conference. Chantat Eksombatchai, Pranav Jindal, Jerry Zitao Liu, Yuchen Liu, Rahul Sharma, Charles Sugnet, Mark Ulrich, and Jure Leskovec. 2018. Pixie: A system for recommending 3+ billion items to 200+ million users in real-time. In TheWeb Conference."},{"key":"e_1_2_1_7_1","volume-title":"Fast graph representation learning with PyTorch Geometric. arXiv preprint arXiv:1903.02428","author":"Fey Matthias","year":"2019","unstructured":"Matthias Fey and Jan Eric Lenssen . 2019. Fast graph representation learning with PyTorch Geometric. arXiv preprint arXiv:1903.02428 ( 2019 ). Matthias Fey and Jan Eric Lenssen. 2019. Fast graph representation learning with PyTorch Geometric. arXiv preprint arXiv:1903.02428 (2019)."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/3366423.3380297"},{"key":"e_1_2_1_9_1","unstructured":"Swapnil Gandhi and Anand Padmanabha Iyer. 2021. P3: Distributed deep graph learning at scale. In USENIX. 551--568. Swapnil Gandhi and Anand Padmanabha Iyer. 2021. P3: Distributed deep graph learning at scale. In USENIX. 
551--568."},{"key":"e_1_2_1_10_1","volume-title":"large minibatch sgd: Training imagenet in 1 hour. arXiv preprint arXiv:1706.02677","author":"Goyal Priya","year":"2017","unstructured":"Priya Goyal , Piotr Doll\u00e1r , Ross Girshick , Pieter Noordhuis , Lukasz Wesolowski , Aapo Kyrola , Andrew Tulloch , Yangqing Jia , and Kaiming He. 2017. Accurate , large minibatch sgd: Training imagenet in 1 hour. arXiv preprint arXiv:1706.02677 ( 2017 ). Priya Goyal, Piotr Doll\u00e1r, Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, and Kaiming He. 2017. Accurate, large minibatch sgd: Training imagenet in 1 hour. arXiv preprint arXiv:1706.02677 (2017)."},{"key":"e_1_2_1_11_1","volume-title":"MultiBiSage: A Web-Scale Recommendation System Using Multiple Bipartite Graphs at Pinterest. arXiv preprint arXiv:2205.10666","author":"Gurukar Saket","year":"2022","unstructured":"Saket Gurukar , Nikil Pancha , Andrew Zhai , Eric Kim , Samson Hu , Srinivasan Parthasarathy , Charles Rosenberg , and Jure Leskovec . 2022. MultiBiSage: A Web-Scale Recommendation System Using Multiple Bipartite Graphs at Pinterest. arXiv preprint arXiv:2205.10666 ( 2022 ). Saket Gurukar, Nikil Pancha, Andrew Zhai, Eric Kim, Samson Hu, Srinivasan Parthasarathy, Charles Rosenberg, and Jure Leskovec. 2022. MultiBiSage: A Web-Scale Recommendation System Using Multiple Bipartite Graphs at Pinterest. arXiv preprint arXiv:2205.10666 (2022)."},{"key":"e_1_2_1_12_1","volume-title":"DistMILE: A Distributed Multi-Level Framework for Scalable Graph Embedding. In 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics (HiPC). IEEE, 282--291","author":"He Yuntian","year":"2021","unstructured":"Yuntian He , Saket Gurukar , Pouya Kousha , Hari Subramoni , Dhabaleswar K Panda , and Srinivasan Parthasarathy . 2021 . DistMILE: A Distributed Multi-Level Framework for Scalable Graph Embedding. 
In 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics (HiPC). IEEE, 282--291 . Yuntian He, Saket Gurukar, Pouya Kousha, Hari Subramoni, Dhabaleswar K Panda, and Srinivasan Parthasarathy. 2021. DistMILE: A Distributed Multi-Level Framework for Scalable Graph Embedding. In 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics (HiPC). IEEE, 282--291."},{"key":"e_1_2_1_13_1","volume-title":"Open Graph Benchmark: Datasets for Machine Learning on Graphs. arXiv preprint arXiv:2005.00687","author":"Hu Weihua","year":"2020","unstructured":"Weihua Hu , Matthias Fey , Marinka Zitnik , Yuxiao Dong , Hongyu Ren , Bowen Liu , Michele Catasta , and Jure Leskovec . 2020. Open Graph Benchmark: Datasets for Machine Learning on Graphs. arXiv preprint arXiv:2005.00687 ( 2020 ). Weihua Hu, Matthias Fey, Marinka Zitnik, Yuxiao Dong, Hongyu Ren, Bowen Liu, Michele Catasta, and Jure Leskovec. 2020. Open Graph Benchmark: Datasets for Machine Learning on Graphs. arXiv preprint arXiv:2005.00687 (2020)."},{"key":"e_1_2_1_14_1","volume-title":"Open graph benchmark: Datasets for machine learning on graphs. Advances in neural information processing systems 33","author":"Hu Weihua","year":"2020","unstructured":"Weihua Hu , Matthias Fey , Marinka Zitnik , Yuxiao Dong , Hongyu Ren , Bowen Liu , Michele Catasta , and Jure Leskovec . 2020. Open graph benchmark: Datasets for machine learning on graphs. Advances in neural information processing systems 33 ( 2020 ), 22118--22133. Weihua Hu, Matthias Fey, Marinka Zitnik, Yuxiao Dong, Hongyu Ren, Bowen Liu, Michele Catasta, and Jure Leskovec. 2020. Open graph benchmark: Datasets for machine learning on graphs. Advances in neural information processing systems 33 (2020), 22118--22133."},{"key":"e_1_2_1_15_1","volume-title":"TheWeb Conference. 2704--2710","author":"Hu Ziniu","year":"2020","unstructured":"Ziniu Hu , Yuxiao Dong , Kuansan Wang , and Yizhou Sun . 2020 . 
Heterogeneous graph transformer . In TheWeb Conference. 2704--2710 . Ziniu Hu, Yuxiao Dong, Kuansan Wang, and Yizhou Sun. 2020. Heterogeneous graph transformer. In TheWeb Conference. 2704--2710."},{"key":"e_1_2_1_16_1","volume-title":"On using very large target vocabulary for neural machine translation. ACL","author":"Jean S\u00e9bastien","year":"2015","unstructured":"S\u00e9bastien Jean , Kyunghyun Cho , Roland Memisevic , and Yoshua Bengio . 2015. On using very large target vocabulary for neural machine translation. ACL ( 2015 ). S\u00e9bastien Jean, Kyunghyun Cho, Roland Memisevic, and Yoshua Bengio. 2015. On using very large target vocabulary for neural machine translation. ACL (2015)."},{"key":"e_1_2_1_17_1","volume-title":"TheWeb Conference. 1581--1591","author":"Zhuoren","unstructured":"Zhuoren Jiang et al. 2020. Task-oriented genetic activation for large-scale complex heterogeneous graph embedding . In TheWeb Conference. 1581--1591 . Zhuoren Jiang et al. 2020. Task-oriented genetic activation for large-scale complex heterogeneous graph embedding. In TheWeb Conference. 1581--1591."},{"key":"e_1_2_1_18_1","first-page":"359","article-title":"A fast and high quality multilevel scheme for partitioning irregular graphs","volume":"20","author":"Karypis George","year":"1998","unstructured":"George Karypis and Vipin Kumar . 1998 . A fast and high quality multilevel scheme for partitioning irregular graphs . SDM 20 , 1 (1998), 359 -- 392 . George Karypis and Vipin Kumar. 1998. A fast and high quality multilevel scheme for partitioning irregular graphs. SDM 20, 1 (1998), 359--392.","journal-title":"SDM"},{"key":"e_1_2_1_19_1","volume-title":"Mile: A multi-level framework for scalable graph embedding. arXiv preprint arXiv:1802.09612","author":"Liang Jiongqian","year":"2018","unstructured":"Jiongqian Liang , Saket Gurukar , and Srinivasan Parthasarathy . 2018 . Mile: A multi-level framework for scalable graph embedding. arXiv preprint arXiv:1802.09612 (2018). 
Jiongqian Liang, Saket Gurukar, and Srinivasan Parthasarathy. 2018. Mile: A multi-level framework for scalable graph embedding. arXiv preprint arXiv:1802.09612 (2018)."},{"key":"e_1_2_1_20_1","doi-asserted-by":"crossref","unstructured":"Jiongqian Liang Peter Jacobs Jiankai Sun and Srinivasan Parthasarathy. 2018. Semi-supervised embedding in attributed networks with outliers. In SDM. Jiongqian Liang Peter Jacobs Jiankai Sun and Srinivasan Parthasarathy. 2018. Semi-supervised embedding in attributed networks with outliers. In SDM.","DOI":"10.1137\/1.9781611975321.18"},{"key":"e_1_2_1_21_1","volume-title":"Sgdr: Stochastic gradient descent with warm restarts. arXiv preprint arXiv:1608.03983","author":"Loshchilov Ilya","year":"2016","unstructured":"Ilya Loshchilov and Frank Hutter . 2016 . Sgdr: Stochastic gradient descent with warm restarts. arXiv preprint arXiv:1608.03983 (2016). Ilya Loshchilov and Frank Hutter. 2016. Sgdr: Stochastic gradient descent with warm restarts. arXiv preprint arXiv:1608.03983 (2016)."},{"key":"e_1_2_1_22_1","unstructured":"Pinterest. 2022. Your audience is here. https:\/\/business.pinterest.com\/en\/audience\/. Pinterest. 2022. Your audience is here. https:\/\/business.pinterest.com\/en\/audience\/."},{"key":"e_1_2_1_23_1","volume-title":"Pitfalls of graph neural network evaluation. arXiv preprint arXiv:1811.05868","author":"Shchur Oleksandr","year":"2018","unstructured":"Oleksandr Shchur , Maximilian Mumme , Aleksandar Bojchevski , and Stephan G\u00fcnnemann . 2018. Pitfalls of graph neural network evaluation. arXiv preprint arXiv:1811.05868 ( 2018 ). Oleksandr Shchur, Maximilian Mumme, Aleksandar Bojchevski, and Stephan G\u00fcnnemann. 2018. Pitfalls of graph neural network evaluation. 
arXiv preprint arXiv:1811.05868 (2018)."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/2783258.2783307"},{"key":"e_1_2_1_25_1","unstructured":"Ashish Vaswani Noam Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan N Gomez \u0141ukasz Kaiser and Illia Polosukhin. 2017. Attention is all you need. In NeurIPS. Ashish Vaswani Noam Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan N Gomez \u0141ukasz Kaiser and Illia Polosukhin. 2017. Attention is all you need. In NeurIPS."},{"key":"e_1_2_1_26_1","doi-asserted-by":"crossref","unstructured":"Xiao Wang Houye Ji Chuan Shi Bai Wang Yanfang Ye Peng Cui and Philip S Yu. 2019. Heterogeneous graph attention network. In The world wide web conference. 2022--2032. Xiao Wang Houye Ji Chuan Shi Bai Wang Yanfang Ye Peng Cui and Philip S Yu. 2019. Heterogeneous graph attention network. In The world wide web conference. 2022--2032.","DOI":"10.1145\/3308558.3313562"},{"key":"e_1_2_1_27_1","volume-title":"Mixed Negative Sampling for Learning Two-tower Neural Networks in Recommendations. In TheWeb Conference. 441--447","author":"Ji","unstructured":"Ji Yang et al. 2020 . Mixed Negative Sampling for Learning Two-tower Neural Networks in Recommendations. In TheWeb Conference. 441--447 . Ji Yang et al. 2020. Mixed Negative Sampling for Learning Two-tower Neural Networks in Recommendations. In TheWeb Conference. 441--447."},{"key":"e_1_2_1_28_1","volume-title":"International conference on machine learning. PMLR, 40--48","author":"Yang Zhilin","year":"2016","unstructured":"Zhilin Yang , William Cohen , and Ruslan Salakhudinov . 2016 . Revisiting semi-supervised learning with graph embeddings . In International conference on machine learning. PMLR, 40--48 . Zhilin Yang, William Cohen, and Ruslan Salakhudinov. 2016. Revisiting semi-supervised learning with graph embeddings. In International conference on machine learning. 
PMLR, 40--48."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3219890"},{"key":"e_1_2_1_30_1","volume-title":"Scalable graph neural networks for heterogeneous graphs. arXiv preprint arXiv:2011.09679","author":"Yu Lingfan","year":"2020","unstructured":"Lingfan Yu , Jiajun Shen , Jinyang Li , and Adam Lerer . 2020. Scalable graph neural networks for heterogeneous graphs. arXiv preprint arXiv:2011.09679 ( 2020 ). Lingfan Yu, Jiajun Shen, Jinyang Li, and Adam Lerer. 2020. Scalable graph neural networks for heterogeneous graphs. arXiv preprint arXiv:2011.09679 (2020)."},{"key":"e_1_2_1_31_1","doi-asserted-by":"crossref","unstructured":"Chuxu Zhang Dongjin Song Chao Huang Ananthram Swami and Nitesh V Chawla. 2019. Heterogeneous graph neural network. In KDD. 793--803. Chuxu Zhang Dongjin Song Chao Huang Ananthram Swami and Nitesh V Chawla. 2019. Heterogeneous graph neural network. In KDD. 793--803.","DOI":"10.1145\/3292500.3330961"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/3289600.3291001"},{"key":"e_1_2_1_33_1","volume-title":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2585--2595","author":"Zhang Fanjin","year":"2019","unstructured":"Fanjin Zhang , Xiao Liu , Jie Tang , Yuxiao Dong , Peiran Yao , Jie Zhang , Xiaotao Gu , Yan Wang , Bin Shao , Rui Li , 2019 . Oag: Toward linking large-scale heterogeneous entity graphs . In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2585--2595 . Fanjin Zhang, Xiao Liu, Jie Tang, Yuxiao Dong, Peiran Yao, Jie Zhang, Xiaotao Gu, Yan Wang, Bin Shao, Rui Li, et al. 2019. Oag: Toward linking large-scale heterogeneous entity graphs. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2585--2595."},{"key":"e_1_2_1_34_1","volume-title":"2020 IEEE\/ACM 10th Workshop on Irregular Applications: Architectures and Algorithms (IA3). 
IEEE, 36--44","author":"Da","unstructured":"Da Zheng et al. 2020. Distdgl: distributed graph neural network training for billion-scale graphs . In 2020 IEEE\/ACM 10th Workshop on Irregular Applications: Architectures and Algorithms (IA3). IEEE, 36--44 . Da Zheng et al. 2020. Distdgl: distributed graph neural network training for billion-scale graphs. In 2020 IEEE\/ACM 10th Workshop on Irregular Applications: Architectures and Algorithms (IA3). IEEE, 36--44."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/3394486.3403169"},{"key":"e_1_2_1_36_1","volume-title":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2653--2661","author":"Zhuang Jinfeng","year":"2019","unstructured":"Jinfeng Zhuang and Yu Liu . 2019 . PinText: A multitask text embedding system in pinterest . In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2653--2661 . Jinfeng Zhuang and Yu Liu. 2019. PinText: A multitask text embedding system in pinterest. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 
2653--2661."}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3574245.3574262","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,21]],"date-time":"2023-02-21T23:15:23Z","timestamp":1677021323000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3574245.3574262"}},"subtitle":["A Web-Scale Recommendation System Using Multiple Bipartite Graphs at Pinterest"],"short-title":[],"issued":{"date-parts":[[2022,12]]},"references-count":36,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2022,12]]}},"alternative-id":["10.14778\/3574245.3574262"],"URL":"http:\/\/dx.doi.org\/10.14778\/3574245.3574262","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2022,12]]},"assertion":[{"value":"2023-02-21","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}