{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,23]],"date-time":"2025-12-23T15:39:34Z","timestamp":1766504374595,"version":"3.41.0"},"reference-count":33,"publisher":"Association for Computing Machinery (ACM)","issue":"5","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Knowl. Discov. Data"],"published-print":{"date-parts":[[2025,6,30]]},"abstract":"<jats:p>Graph Transformers have garnered significant attention due to their ability to address the challenges of long-distance interactions in previous GNNs. However, most current graph Transformers face difficulties when dealing with heterophilic graphs. To investigate this issue, we first analyzed the distribution of attention weights for homophilic and heterophilic graphs. We discovered that heterophily interferes with the allocation of attention weights, leading to errors in node classification. Further investigation revealed that the root cause may be the difficulty of current graph Transformers in capturing the difference between the features of each node and its neighbors. To alleviate this issue, we propose a position encoding strategy called DiSP to better capture the feature difference and introduce FDphormer, a new efficient and simple graph Transformer model based on DiSP. Additionally, we analyze the generalization error of existing graph Transformer models and provide an upper bound on the generalization error of current graph Transformers with the introduction of DiSP. Extensive experiments demonstrate that FDphormer not only outperforms state-of-the-art methods on diverse heterogeneous datasets but also exhibits competitive performance under homophily.<\/jats:p>","DOI":"10.1145\/3727882","type":"journal-article","created":{"date-parts":[[2025,4,15]],"date-time":"2025-04-15T16:25:46Z","timestamp":1744734346000},"page":"1-19","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["FDphormer: Beyond Homophily with Feature-Difference Position Encoding"],"prefix":"10.1145","volume":"19","author":[{"ORCID":"https:\/\/orcid.org\/0009-0007-8307-9904","authenticated-orcid":false,"given":"Dong","family":"Li","sequence":"first","affiliation":[{"name":"Institute for Advanced Study in Mathematics, Harbin Institute of Technology (HIT), Harbin, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-4144-1336","authenticated-orcid":false,"given":"Aijia","family":"Zhang","sequence":"additional","affiliation":[{"name":"Department of Mathematics, Harbin Institute of Technology (HIT), Harbin, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5401-7834","authenticated-orcid":false,"given":"Huan","family":"Xiong","sequence":"additional","affiliation":[{"name":"Institute for Advanced Study in Mathematics, Harbin Institute of Technology (HIT), Harbin, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4072-0577","authenticated-orcid":false,"given":"Biqing","family":"Qi","sequence":"additional","affiliation":[{"name":"Shanghai Artificial Intelligence Laboratory, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-1644-5812","authenticated-orcid":false,"given":"Junqi","family":"Gao","sequence":"additional","affiliation":[{"name":"Department of Mathematics, Harbin Institute of Technology (HIT), Harbin, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2025,6,16]]},"reference":[{"key":"e_1_3_1_2_2","first-page":"21","volume-title":"International Conference on Machine Learning","author":"Abu-El-Haija Sami","year":"2019","unstructured":"Sami Abu-El-Haija, Bryan Perozzi, Amol Kapoor, Nazanin Alipourfard, Kristina Lerman, Hrayr Harutyunyan, Greg Ver Steeg, and Aram Galstyan. 2019. Mixhop: Higher-order graph convolutional architectures via sparsified neighborhood mixing. In International Conference on Machine Learning. PMLR, 21\u201329."},{"key":"e_1_3_1_3_2","first-page":"463","article-title":"Rademacher and Gaussian complexities: Risk bounds and structural results","volume":"3","author":"Bartlett Peter L.","year":"2002","unstructured":"Peter L. Bartlett and Shahar Mendelson. 2002. Rademacher and Gaussian complexities: Risk bounds and structural results. Journal of Machine Learning Research 3 (Nov. 2002), 463\u2013482.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_1_4_2","unstructured":"Deyu Bo Chuan Shi Lele Wang and Renjie Liao. 2023. Specformer: Spectral graph neural networks meet transformers. ArXiv:2303.01028. Retrieved from https:\/\/arxiv.org\/abs\/2303.01028"},{"key":"e_1_3_1_5_2","first-page":"1576","volume-title":"International Conference on Machine Learning","author":"Chen Dexiong","year":"2020","unstructured":"Dexiong Chen, Laurent Jacob, and Julien Mairal. 2020. Convolutional kernel networks for graph-structured data. In International Conference on Machine Learning. PMLR, 1576\u20131586."},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.5555\/3524938.3525099"},{"key":"e_1_3_1_7_2","unstructured":"Zihang Dai Zhilin Yang Yiming Yang Jaime Carbonell Quoc V. Le and Ruslan Salakhutdinov. 2019. Transformer-xl: Attentive language models beyond a fixed-length context. ArXiv:1901.02860. Retrieved from https:\/\/arxiv.org\/abs\/1901.02860"},{"key":"e_1_3_1_8_2","first-page":"4171","volume-title":"Proceedings of the NAACL-HLT","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin, Ming-Wei Chang, and Kenton Lee. 2019. Google, KT, Language, AI: BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the NAACL-HLT, 4171\u20134186."},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1016\/0022-1236(67)90017-1"},{"key":"e_1_3_1_10_2","unstructured":"Vijay Prakash Dwivedi and Xavier Bresson. 2020. A generalization of transformer networks to graphs. arXiv:2012.09699v1. Retrieved from https:\/\/arxiv.org\/abs\/arXiv:2012.09699v1"},{"key":"e_1_3_1_11_2","first-page":"5793","volume-title":"International Conference on Machine Learning","author":"Edelman Benjamin L.","year":"2022","unstructured":"Benjamin L. Edelman, Surbhi Goel, Sham Kakade, and Cyril Zhang. 2022. Inductive biases and variable creation in self-attention mechanisms. In International Conference on Machine Learning. PMLR, 5793\u20135831."},{"key":"e_1_3_1_12_2","article-title":"Inductive representation learning on large graphs","volume":"30","author":"Hamilton Will","year":"2017","unstructured":"Will Hamilton, Zhitao Ying, and Jure Leskovec. 2017. Inductive representation learning on large graphs. In Advances in Neural Information Processing Systems, Vol. 30.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_1_13_2","unstructured":"Md Shamim Hussain Mohammed J. Zaki and Dharmashankar Subramanian. 2021. Edge-augmented graph transformers: Global self-attention is enough for graphs. ArXiv:2108.03348. Retrieved from https:\/\/arxiv.org\/abs\/2108.03348"},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1145\/3534678.3539296"},{"key":"e_1_3_1_15_2","unstructured":"Thomas N. Kipf and Max Welling. 2016. Semi-supervised classification with graph convolutional networks. ArXiv:1609.02907. Retrieved from https:\/\/arxiv.org\/abs\/1609.02907"},{"key":"e_1_3_1_16_2","first-page":"21618","article-title":"Rethinking graph transformers with spectral attention","volume":"34","author":"Kreuzer Devin","year":"2021","unstructured":"Devin Kreuzer, Dominique Beaini, Will Hamilton, Vincent L\u00e9tourneau, and Prudencio Tossou. 2021. Rethinking graph transformers with spectral attention. In Advances in Neural Information Processing Systems, Vol. 34, 21618\u201321629.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.01270"},{"key":"e_1_3_1_18_2","unstructured":"Gr\u00e9goire Mialon Dexiong Chen Margot Selosse and Julien Mairal. 2021. Graphit: Encoding graph structure in transformers. ArXiv:2106.05667. Retrieved from https:\/\/arxiv.org\/abs\/2106.05667"},{"key":"e_1_3_1_19_2","first-page":"12559","volume-title":"Advances in Neural Information Processing Systems","volume":"33","author":"Rong Yu","year":"2020","unstructured":"Yu Rong, Yatao Bian, Tingyang Xu, Weiyang Xie, Ying Wei, Wenbing Huang, and Junzhou Huang. 2020. Self-supervised graph transformer on large-scale molecular data. In Advances in Neural Information Processing Systems. H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin (Eds.), Vol. 33. Curran Associates, Inc., 12559\u201312571. Retrieved from https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2020\/file\/94aef38441efa3380a3bed3faf1f9d5d-Paper.pdf"},{"key":"e_1_3_1_20_2","unstructured":"Peter Shaw Jakob Uszkoreit and Ashish Vaswani. 2018. Self-attention with relative position representations. ArXiv:1803.02155. Retrieved from https:\/\/arxiv.org\/abs\/1803.02155"},{"key":"e_1_3_1_21_2","unstructured":"Yunchong Song Chenghu Zhou Xinbing Wang and Zhouhan Lin. 2023. Ordered GNN: Ordering message passing to deal with heterophily and over-smoothing. ArXiv:2302.01524. Retrieved from https:\/\/arxiv.org\/abs\/2302.01524"},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2023.127063"},{"key":"e_1_3_1_23_2","unstructured":"Jianlin Su Yu Lu Shengfeng Pan Ahmed Murtadha Bo Wen and Yunfeng Liu. 2021. Roformer: Enhanced transformer with rotary position embedding. ArXiv:2104.09864. Retrieved from https:\/\/arxiv.org\/abs\/2104.09864"},{"key":"e_1_3_1_24_2","article-title":"Attention is all you need","volume":"30","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems, Vol. 30.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_1_25_2","unstructured":"Petar Veli\u010dkovi\u0107 Guillem Cucurull Arantxa Casanova Adriana Romero Pietro Lio and Yoshua Bengio. 2017. Graph attention networks. ArXiv:1710.10903. Retrieved from https:\/\/arxiv.org\/abs\/1710.10903"},{"key":"e_1_3_1_26_2","first-page":"13266","article-title":"Representing long-range context for graph neural networks with global attention","volume":"34","author":"Wu Zhanghao","year":"2021","unstructured":"Zhanghao Wu, Paras Jain, Matthew Wright, Azalia Mirhoseini, Joseph E. Gonzalez, and Ion Stoica. 2021. Representing long-range context for graph neural networks with global attention. In Advances in Neural Information Processing Systems, Vol. 34, 13266\u201313279.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_1_27_2","unstructured":"Qitian Wu Chenxiao Yang Wentao Zhao Yixuan He David Wipf and Junchi Yan. 2023. Difformer: Scalable (graph) transformers induced by energy constrained diffusion. arXiv:2301.09474. Retrieved from https:\/\/arxiv.org\/abs\/2301.09474"},{"key":"e_1_3_1_28_2","first-page":"5453","volume-title":"International Conference on Machine Learning","author":"Xu Keyulu","year":"2018","unstructured":"Keyulu Xu, Chengtao Li, Yonglong Tian, Tomohiro Sonobe, Ken-ichi Kawarabayashi, and Stefanie Jegelka. 2018. Representation learning on graphs with jumping knowledge networks. In International Conference on Machine Learning. PMLR, 5453\u20135462."},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM54844.2022.00169"},{"key":"e_1_3_1_30_2","first-page":"10936","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","volume":"37","author":"Yeh Pei-Kai","year":"2023","unstructured":"Pei-Kai Yeh, Hsi-Wen Chen, and Ming-Syan Chen. 2023. Random walk conformer: Learning graph representation from long and short range. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 37, 10936\u201310944."},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neunet.2022.05.026"},{"key":"e_1_3_1_32_2","first-page":"21171","article-title":"Hierarchical graph transformer with adaptive node sampling","volume":"35","author":"Zhang Zaixi","year":"2022","unstructured":"Zaixi Zhang, Qi Liu, Qingyong Hu, and Chee-Kong Lee. 2022. Hierarchical graph transformer with adaptive node sampling. In Advances in Neural Information Processing Systems, Vol. 35, 21171\u201321183.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_1_33_2","unstructured":"Jiawei Zhang Haopeng Zhang Congying Xia and Li Sun. 2020. Graph-BERT: Only attention is needed for learning graph representations. ArXiv:2001.05140. Retrieved from https:\/\/arxiv.org\/abs\/2001.05140"},{"key":"e_1_3_1_34_2","unstructured":"Jianan Zhao Chaozhuo Li Qianlong Wen Yiqi Wang Yuming Liu Hao Sun Xing Xie and Yanfang Ye. 2021. Gophormer: Ego-graph transformer for node classification. ArXiv:2110.13094. Retrieved from https:\/\/arxiv.org\/abs\/2110.13094"}],"container-title":["ACM Transactions on Knowledge Discovery from Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3727882","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,16]],"date-time":"2025-06-16T16:57:11Z","timestamp":1750093031000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3727882"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6,16]]},"references-count":33,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2025,6,30]]}},"alternative-id":["10.1145\/3727882"],"URL":"https:\/\/doi.org\/10.1145\/3727882","relation":{},"ISSN":["1556-4681","1556-472X"],"issn-type":[{"type":"print","value":"1556-4681"},{"type":"electronic","value":"1556-472X"}],"subject":[],"published":{"date-parts":[[2025,6,16]]},"assertion":[{"value":"2024-08-29","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-03-16","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-06-16","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}