{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,30]],"date-time":"2026-05-30T01:11:40Z","timestamp":1780103500466,"version":"3.54.0"},"reference-count":52,"publisher":"Association for Computing Machinery (ACM)","issue":"1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2020,9]]},"abstract":"<jats:p>\n            Given a graph\n            <jats:italic>G<\/jats:italic>\n            where each node is associated with a set of attributes,\n            <jats:italic>attributed network embedding (ANE)<\/jats:italic>\n            maps each node\n            <jats:italic>v<\/jats:italic>\n            \u2208\n            <jats:italic>G<\/jats:italic>\n            to a compact vector\n            <jats:italic>\n              X\n              <jats:sub>v<\/jats:sub>\n            <\/jats:italic>\n            , which can be used in downstream machine learning tasks. Ideally,\n            <jats:italic>\n              X\n              <jats:sub>v<\/jats:sub>\n            <\/jats:italic>\n            should capture node\n            <jats:italic>v<\/jats:italic>\n            's\n            <jats:italic>affinity<\/jats:italic>\n            to each attribute, which considers not only\n            <jats:italic>v<\/jats:italic>\n            's own attribute associations, but also those of its connected nodes along edges in\n            <jats:italic>G<\/jats:italic>\n            . It is challenging to obtain high-utility embeddings that enable accurate predictions; scaling effective ANE computation to massive graphs with millions of nodes pushes the difficulty of the problem to a whole new level. 
Existing solutions largely fail on such graphs, leading to prohibitive costs, low-quality embeddings, or both.\n          <\/jats:p>\n          <jats:p>\n            This paper proposes PANE, an effective and scalable approach to ANE computation for massive graphs that achieves state-of-the-art result quality on multiple benchmark datasets, measured by the accuracy of three common prediction tasks: attribute inference, link prediction, and node classification. In particular, for the large\n            <jats:italic>MAG<\/jats:italic>\n            data with over 59 million nodes, 0.98 billion edges, and 2000 attributes, PANE is the only known viable solution that obtains effective embeddings on a single server, within 12 hours.\n          <\/jats:p>\n          <jats:p>PANE obtains high scalability and effectiveness through three main algorithmic designs. First, it formulates the learning objective based on a novel random walk model for attributed networks. The resulting optimization task is still challenging on large graphs. Second, PANE includes a highly efficient solver for the above optimization problem, whose key module is a carefully designed initialization of the embeddings, which drastically reduces the number of iterations required to converge. Finally, PANE utilizes multi-core CPUs through non-trivial parallelization of the above solver, which achieves scalability while retaining the high quality of the resulting embeddings. 
Extensive experiments, comparing 10 existing approaches on 8 real datasets, demonstrate that PANE consistently outperforms all existing methods in terms of result quality, while being orders of magnitude faster.<\/jats:p>","DOI":"10.14778\/3421424.3421430","type":"journal-article","created":{"date-parts":[[2020,10,28]],"date-time":"2020-10-28T01:15:11Z","timestamp":1603847711000},"page":"37-49","source":"Crossref","is-referenced-by-count":47,"title":["Scaling attributed network embedding to massive graphs"],"prefix":"10.14778","volume":"14","author":[{"given":"Renchi","family":"Yang","sequence":"first","affiliation":[{"name":"Nanyang Technological University"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jieming","family":"Shi","sequence":"additional","affiliation":[{"name":"Hong Kong Polytechnic University"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Xiaokui","family":"Xiao","sequence":"additional","affiliation":[{"name":"National University of Singapore"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yin","family":"Yang","sequence":"additional","affiliation":[{"name":"Hamad bin Khalifa University"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Juncheng","family":"Liu","sequence":"additional","affiliation":[{"name":"National University of Singapore"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Sourav S.","family":"Bhowmick","sequence":"additional","affiliation":[{"name":"Nanyang Technological University"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2020,10,27]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/3336191.3371788"},{"key":"e_1_2_1_2_1","doi-asserted-by":"crossref","unstructured":"L\u00e9on Bottou. 2010. Large-scale machine learning with stochastic gradient descent. In COMPSTAT. 177--186.  L\u00e9on Bottou. 2010. 
Large-scale machine learning with stochastic gradient descent. In COMPSTAT. 177--186.","DOI":"10.1007\/978-3-7908-2604-3_16"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330964"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.5555\/89086.89095"},{"key":"e_1_2_1_5_1","doi-asserted-by":"crossref","unstructured":"Pierre Comon Xavier Luciani and Andr\u00e9 LF De Almeida. 2009. Tensor decompositions alternating least squares and other tales. J. Chemom. (2009) 393--405.","DOI":"10.1002\/cem.1236"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1022627411411"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1111\/j.2517-6161.1977.tb01600.x"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.5555\/3304222.3304235"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330866"},{"key":"e_1_2_1_10_1","volume-title":"Singular value decomposition and least squares solutions. Linear Algebra","author":"Golub Gene H","year":"1971","unstructured":"Gene H Golub and Christian Reinsch. 1971. Singular value decomposition and least squares solutions. Linear Algebra (1971), 134--151."},{"key":"e_1_2_1_11_1","volume-title":"Matrix computations","author":"Golub Gene H","year":"1996","unstructured":"Gene H Golub and Charles F Van Loan. 1996. Matrix computations. 1996. 
Johns Hopkins University, Press, Baltimore, MD, USA (1996)."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.5555\/3086952"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.5555\/2969033.2969125"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/2939672.2939754"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.5555\/3294771.3294869"},{"key":"e_1_2_1_16_1","volume-title":"Long short-term memory. Neural computation","author":"Hochreiter Sepp","year":"1997","unstructured":"Sepp Hochreiter and J\u00fcrgen Schmidhuber. 1997. Long short-term memory. Neural computation (1997), 1735--1780."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330948"},{"key":"e_1_2_1_18_1","doi-asserted-by":"crossref","unstructured":"Xiao Huang Jundong Li and Xia Hu. 2017. Accelerated attributed network embedding. In SDM.","DOI":"10.1145\/3018661.3018667"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/775152.775191"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.5555\/3367243.3367410"},{"key":"e_1_2_1_21_1","unstructured":"Kaggle. 2012. KDD Cup. https:\/\/www.kaggle.com\/c\/kddcup2012-track1."},{"key":"e_1_2_1_22_1","volume-title":"Semi-supervised classification with graph convolutional networks. ICLR","author":"Kipf Thomas N","year":"2016","unstructured":"Thomas N Kipf and Max Welling. 2016. Semi-supervised classification with graph convolutional networks. 
ICLR (2016)."},{"key":"e_1_2_1_23_1","unstructured":"Adam Lerer Ledell Wu Jiajun Shen Timothee Lacroix Luca Wehrstedt Abhijit Bose and Alex Peysakhovich. 2019. PyTorch-BigGraph: A Large-scale Graph Embedding System. In SysML."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.5555\/2999134.2999195"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3219988"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3220062"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/3289600.3291015"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/3377850"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.5555\/2969239.2969395"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.5555\/3304889.3305023"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/2623330.2623732"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.5555\/576628"},{"key":"e_1_2_1_33_1","volume-title":"Complex Networks","author":"Sheikh Nasrullah","unstructured":"Nasrullah Sheikh, Zekarias T Kefato, and Alberto Montresor. 2019. A Simple Approach to Attributed Graph Embedding via Enhanced Autoencoder. In Complex Networks. Springer, 797--809."},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/2740908.2742839"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/2736277.2741093"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2006.70"},{"key":"e_1_2_1_37_1","unstructured":"Petar Veli\u010dkovi\u0107 William Fedus William L. 
Hamilton Pietro Li\u00f2 Yoshua Bengio and R Devon Hjelm. 2019. Deep Graph Infomax. In ICLR."},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10107-015-0892-3"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/3357384.3358091"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.5555\/3304889.3305058"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.5555\/2832415.2832542"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/3041021.3054181"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.5555\/3367471.3367604"},{"key":"e_1_2_1_44_1","doi-asserted-by":"crossref","unstructured":"Hong Yang Shirui Pan Peng Zhang Ling Chen Defu Lian and Chengqi Zhang. 2018. Binarized attributed network embedding. In ICDM.","DOI":"10.1109\/ICDM.2018.8626170"},{"key":"e_1_2_1_45_1","doi-asserted-by":"crossref","unstructured":"Jaewon Yang Julian McAuley and Jure Leskovec. 2013. Community detection in networks with node attributes. In ICDM.","DOI":"10.1109\/ICDM.2013.167"},{"key":"e_1_2_1_46_1","doi-asserted-by":"crossref","unstructured":"Renchi Yang Jieming Shi Xiaokui Xiao Yin Yang and Sourav S Bhowmick. 2020. Homogeneous Network Embedding for Massive Graphs via Reweighted Personalized PageRank. In PVLDB. 
670--683.","DOI":"10.14778\/3377369.3377376"},{"key":"e_1_2_1_47_1","volume-title":"Bhowmick","author":"Yang Renchi","year":"2020","unstructured":"Renchi Yang, Jieming Shi, Xiaokui Xiao, Yin Yang, Juncheng Liu, and Sourav S. Bhowmick. 2020. Scaling Attributed Network Embedding to Massive Graphs. arXiv preprint (2020)."},{"key":"e_1_2_1_48_1","doi-asserted-by":"crossref","unstructured":"Daokun Zhang Jie Yin Xingquan Zhu and Chengqi Zhang. 2016. Homophily structure and content augmented network representation learning. In ICDM.","DOI":"10.1109\/ICDM.2016.0072"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.5555\/3304889.3305099"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.5555\/3298483.3298661"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/3269206.3271741"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.14778\/3352063.3352127"}],"container-title":["Proceedings of the VLDB 
Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3421424.3421430","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T11:19:47Z","timestamp":1672226387000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3421424.3421430"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,9]]},"references-count":52,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,9]]}},"alternative-id":["10.14778\/3421424.3421430"],"URL":"https:\/\/doi.org\/10.14778\/3421424.3421430","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2020,9]]}}}