{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,3]],"date-time":"2026-06-03T06:38:34Z","timestamp":1780468714207,"version":"3.54.1"},"publisher-location":"New York, NY, USA","reference-count":126,"publisher":"ACM","funder":[{"name":"Silicon Valley Community Foundation"},{"DOI":"10.13039\/501100006374","name":"Amazon research award","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100006374","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Yale AI Engineering Research Grant from Yale Office of the Provost"},{"name":"LEAP-U Sponsored Research from Samsung Research America"},{"name":"National Science Foundation (NSF) IIS Div Of Information & Intelligent Systems","award":["2403317"],"award-info":[{"award-number":["2403317"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2025,8,3]]},"DOI":"10.1145\/3711896.3736564","type":"proceedings-article","created":{"date-parts":[[2025,8,3]],"date-time":"2025-08-03T20:52:41Z","timestamp":1754254361000},"page":"6021-6031","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Hyperbolic Deep Learning for Foundation Models: A Survey"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0009-0008-3193-2448","authenticated-orcid":false,"given":"Neil","family":"He","sequence":"first","affiliation":[{"name":"Yale University, New Haven, Connecticut, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6701-6782","authenticated-orcid":false,"given":"Hiren","family":"Madhu","sequence":"additional","affiliation":[{"name":"Yale University, New Haven, Connecticut, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4345-6003","authenticated-orcid":false,"given":"Ngoc","family":"Bui","sequence":"additional","affiliation":[{"name":"Yale University, New Haven, Connecticut, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2510-5282","authenticated-orcid":false,"given":"Menglin","family":"Yang","sequence":"additional","affiliation":[{"name":"Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5856-5229","authenticated-orcid":false,"given":"Rex","family":"Ying","sequence":"additional","affiliation":[{"name":"Yale University, New Haven, Connecticut, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2025,8,3]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/bty206"},{"key":"e_1_3_2_1_2_1","volume-title":"Hisham Cholakkal, Mubarak Shah, Ming-Hsuan Yang, and Fahad Shahbaz Khan.","author":"Awais Muhammad","year":"2025","unstructured":"Muhammad Awais, Muzammal Naseer, Salman Khan, Rao Muhammad Anwer, Hisham Cholakkal, Mubarak Shah, Ming-Hsuan Yang, and Fahad Shahbaz Khan. 2025. Foundation Models Defining a New Era in Vision: a Survey and Outlook. TPAMI(2025)."},{"key":"e_1_3_2_1_3_1","volume-title":"Jamie Ryan Kiros, and Geoffrey E. Hinton","author":"Ba Jimmy Lei","year":"2016","unstructured":"Jimmy Lei Ba, Jamie Ryan Kiros, and Geoffrey E. Hinton. 2016. Layer Normalization. arXiv arXiv:1607.06450(2016)."},{"key":"e_1_3_2_1_4_1","first-page":"12316","article-title":"Modeling heterogeneous hierarchies with relation-specific hyperbolic cones","volume":"34","author":"Bai Yushi","year":"2021","unstructured":"Yushi Bai, Zhitao Ying, Hongyu Ren, and Jure Leskovec. 2021. Modeling heterogeneous hierarchies with relation-specific hyperbolic cones. NeurIPS, Vol. 34 (2021), 12316-12327.","journal-title":"NeurIPS"},{"key":"e_1_3_2_1_5_1","unstructured":"Federico Barbero Alex Vitvitskyi Christos Perivolaropoulos Razvan Pascanu and Petar Velickovic. 2025. Round and Round We Go! What makes Rotary Positional Encodings useful? arXiv:2410.06205(2025)."},{"key":"e_1_3_2_1_6_1","unstructured":"Ahmad Bdeir Kristian Schwethelm and Niels Landwehr. 2024. Fully Hyperbolic Convolutional Neural Networks for Computer Vision. In ICLR."},{"key":"e_1_3_2_1_7_1","volume-title":"ICLR 2024 Workshop on Machine Learning for Genomics Explorations.","author":"Bhasker Nithya","year":"2024","unstructured":"Nithya Bhasker, Hattie Chung, Louis Boucherie, Vladislav Kim, Stefanie Speidel, and Melanie Weber. 2024. Contrastive poincar\u00e9 maps for single-cell data analysis. In ICLR 2024 Workshop on Machine Learning for Genomics Explorations."},{"key":"e_1_3_2_1_8_1","unstructured":"Rishi Bommasani and et al. 2021. On the Opportunities and Risks of Foundation Models. arXiv preprint Vol. arXiv:2108.07258 (2021)."},{"key":"e_1_3_2_1_9_1","first-page":"1045","article-title":"Latent variable modelling with hyperbolic normalizing flows","author":"Bose Joey","year":"2020","unstructured":"Joey Bose, Ariella Smofsky, Renjie Liao, Prakash Panangaden, and Will Hamilton. 2020. Latent variable modelling with hyperbolic normalizing flows. In ICML. PMLR, 1045-1055.","journal-title":"ICML. PMLR"},{"key":"e_1_3_2_1_10_1","first-page":"1877","article-title":"Language models are few-shot learners","volume":"33","author":"Brown Tom","year":"2020","unstructured":"Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al., 2020. Language models are few-shot learners. NeurIPS, Vol. 33 (2020), 1877-1901.","journal-title":"NeurIPS"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"crossref","unstructured":"Davide Caffagni Federico Cocchi Luca Barsellotti Nicholas Moratelli Sara Sarto Lorenzo Baraldi Marcella Cornia and Rita Cucchiara. 2024. The revolution of multimodal large language models: a survey. arXiv:2402.12451(2024).","DOI":"10.18653\/v1\/2024.findings-acl.807"},{"key":"e_1_3_2_1_12_1","unstructured":"Hanqun Cao Cheng Tan Zhangyang Gao Yilun Xu Guangyong Chen Pheng-Ann Heng and Stan Z Li. 2024. A survey on generative diffusion models. TKDE(2024)."},{"key":"e_1_3_2_1_13_1","first-page":"4868","article-title":"Hyperbolic graph convolutional neural networks","author":"Chami Ines","year":"2019","unstructured":"Ines Chami, Zhitao Ying, Christopher R\u00e9, and Jure Leskovec. 2019. Hyperbolic graph convolutional neural networks. In NeurIPS. 4868-4879.","journal-title":"NeurIPS."},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"crossref","unstructured":"Jiayang Chen Zhihang Hu Siqi Sun Qingxiong Tan Yixuan Wang Qinze Yu Licheng Zong Liang Hong Jin Xiao Tao Shen et al. 2022a. Interpretable RNA foundation model from unannotated data for highly accurate RNA structure and function predictions. arXiv:2204.00300(2022).","DOI":"10.1101\/2022.08.06.503062"},{"key":"e_1_3_2_1_15_1","first-page":"1725","article-title":"Simple and deep graph convolutional networks","author":"Chen Ming","year":"2020","unstructured":"Ming Chen, Zhewei Wei, Zengfeng Huang, Bolin Ding, and Yaliang Li. 2020b. Simple and deep graph convolutional networks. In ICML. PMLR, 1725-1735.","journal-title":"ICML. PMLR"},{"key":"e_1_3_2_1_16_1","volume-title":"Chen and Yaron Lipman","author":"Ricky T.","year":"2024","unstructured":"Ricky T. Q. Chen and Yaron Lipman. 2024. Flow Matching on General Geometries. In ICLR."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2024.3407575"},{"key":"e_1_3_2_1_18_1","unstructured":"Weize Chen Xu Han Yankai Lin Hexu Zhao Zhiyuan Liu Peng Li Maosong Sun and Jie Zhou. 2021. Fully Hyperbolic Neural Networks. arXiv:2105.14686(2021)."},{"key":"e_1_3_2_1_19_1","unstructured":"Xinlei Chen Haoqi Fan Ross Girshick and Kaiming He. 2020a. Improved baselines with momentum contrastive learning. arXiv preprint arXiv:2003.04297(2020)."},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"crossref","unstructured":"Yankai Chen Menglin Yang Yingxue Zhang Mengchen Zhao Ziqiao Meng Jianye Hao and Irwin King. 2022b. Modeling Scale-free Graphs for Knowledge-aware Recommendation. WSDM(2022).","DOI":"10.1145\/3488560.3498419"},{"key":"e_1_3_2_1_21_1","unstructured":"Jeffrey Cheng Marc Marone Orion Weller Dawn Lawrie Daniel Khashabi and Benjamin Van Durme. 2024. Dated data: Tracing knowledge cutoffs in large language models. arXiv:2403.12958(2024)."},{"key":"e_1_3_2_1_22_1","unstructured":"Andy Coenen Emily Reif Ann Yuan Been Kim Adam Pearce Fernanda Vi\u00e9gas and Martin Wattenberg. 2019. Visualizing and Measuring the Geometry of BERT. NuerIPS(2019)."},{"key":"e_1_3_2_1_23_1","volume-title":"Radu Tudor Ionescu, and Mubarak Shah","author":"Croitoru Florinel-Alin","year":"2023","unstructured":"Florinel-Alin Croitoru, Vlad Hondru, Radu Tudor Ionescu, and Mubarak Shah. 2023. Diffusion models in vision: A survey. TPAMI(2023)."},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1038\/s41592-024-02201-0"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"crossref","unstructured":"Jindou Dai Yuwei Wu Zhi Gao and Yunde Jia. 2021. A Hyperbolic-to-Hyperbolic Graph Convolutional Network. arXiv:2104.06942(2021) 154-163.","DOI":"10.1109\/CVPR46437.2021.00022"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"crossref","unstructured":"Tri Dao Daniel Y. Fu Stefano Ermon Atri Rudra and Christopher R\u00e9. 2022. FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness. In NeurIPS.","DOI":"10.52202\/068431-1189"},{"key":"e_1_3_2_1_27_1","volume-title":"Yee Whye Teh, and Arnaud Doucet","author":"Bortoli Valentin De","year":"2022","unstructured":"Valentin De Bortoli, Emile Mathieu, Michael Hutchinson, James Thornton, Yee Whye Teh, and Arnaud Doucet. 2022. Riemannian Score-Based Generative Modelling. In NeurIPS."},{"key":"e_1_3_2_1_28_1","unstructured":"DeepSeek-AI. 2024. DeepSeek-V3 Technical Report. arXiv:2412.19437(2024)."},{"key":"e_1_3_2_1_29_1","first-page":"7694","article-title":"Hyperbolic image-text representations","author":"Desai Karan","year":"2023","unstructured":"Karan Desai, Maximilian Nickel, Tanmay Rajpurohit, Justin Johnson, and Shanmukha Ramakrishna Vedantam. 2023. Hyperbolic image-text representations. In ICML. PMLR, 7694-7731.","journal-title":"ICML. PMLR"},{"key":"e_1_3_2_1_30_1","volume-title":"Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv:1810.04805(2018).","author":"Devlin Jacob","year":"2018","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv:1810.04805(2018)."},{"key":"e_1_3_2_1_31_1","volume-title":"Deep generative model embedding of single-cell RNA-Seq profiles on hyperspheres and hyperbolic spaces. Nature communications","author":"Ding Jiarui","year":"2021","unstructured":"Jiarui Ding and Aviv Regev. 2021. Deep generative model embedding of single-cell RNA-Seq profiles on hyperspheres and hyperbolic spaces. Nature communications, Vol. 12, 1 (2021), 2554."},{"key":"e_1_3_2_1_32_1","unstructured":"Alexey Dosovitskiy Lucas Beyer Alexander Kolesnikov Dirk Weissenborn Xiaohua Zhai Thomas Unterthiner Mostafa Dehghani Matthias Minderer Georg Heigold Sylvain Gelly et al. 2020. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv:2010.11929(2020)."},{"key":"e_1_3_2_1_33_1","unstructured":"Darren Edge Ha Trinh Newman Cheng Joshua Bradley Alex Chao Apurva Mody and Steven Truitt. 2024. From Local to Global: A Graph RAG Approach to Query-Focused Summarization. arXiv:2404.16130(2024)."},{"key":"e_1_3_2_1_34_1","first-page":"7409","article-title":"Hyperbolic vision transformers: Combining improvements in metric learning","author":"Ermolov Aleksandr","year":"2022","unstructured":"Aleksandr Ermolov, Leyla Mirvakhabova, Valentin Khrulkov, Nicu Sebe, and Ivan Oseledets. 2022. Hyperbolic vision transformers: Combining improvements in metric learning. In CVPR. 7409-7419.","journal-title":"CVPR."},{"key":"e_1_3_2_1_35_1","volume-title":"HVT: A Comprehensive Vision Framework for Learning in Non-Euclidean Space. arXiv:2409.16897(2024).","author":"Fein-Ashley Jacob","year":"2024","unstructured":"Jacob Fein-Ashley, Ethan Feng, and Minh Pham. 2024. HVT: A Comprehensive Vision Framework for Learning in Non-Euclidean Space. arXiv:2409.16897(2024)."},{"key":"e_1_3_2_1_36_1","unstructured":"Xingcheng Fu Yisen Gao Yuecen Wei Qingyun Sun Hao Peng Jianxin Li and Xianxian Li. 2024. Hyperbolic Geometric Latent Diffusion Model for Graph Generation. ICML(2024)."},{"key":"e_1_3_2_1_37_1","first-page":"1646","article-title":"Hyperbolic entailment cones for learning hierarchical embeddings","author":"Ganea Octavian","year":"2018","unstructured":"Octavian Ganea, Gary B\u00e9cigneul, and Thomas Hofmann. 2018a. Hyperbolic entailment cones for learning hierarchical embeddings. In ICML. PMLR, 1646-1655.","journal-title":"ICML. PMLR"},{"key":"e_1_3_2_1_38_1","first-page":"5345","article-title":"Hyperbolic neural networks","author":"Ganea Octavian","year":"2018","unstructured":"Octavian Ganea, Gary B\u00e9cigneul, and Thomas Hofmann. 2018b. Hyperbolic neural networks. In NeurIPS. 5345-5355.","journal-title":"NeurIPS."},{"key":"e_1_3_2_1_39_1","volume-title":"Retrieval-augmented generation for large language models: A survey. arXiv:2312.10997","author":"Gao Yunfan","year":"2023","unstructured":"Yunfan Gao, Yun Xiong, Xinyu Gao, Kangxiang Jia, Jinliu Pan, Yuxi Bi, Yi Dai, Jiawei Sun, Haofen Wang, and Haofen Wang. 2023. Retrieval-augmented generation for large language models: A survey. arXiv:2312.10997, Vol. 2 (2023)."},{"key":"e_1_3_2_1_40_1","first-page":"6840","article-title":"Hyperbolic contrastive learning for visual representations beyond objects","author":"Ge Songwei","year":"2023","unstructured":"Songwei Ge, Shlok Mishra, Simon Kornblith, Chun-Liang Li, and David Jacobs. 2023. Hyperbolic contrastive learning for visual representations beyond objects. In CVPR. 6840-6849.","journal-title":"CVPR."},{"key":"e_1_3_2_1_41_1","volume-title":"Simon Kornblith, Chun-Liang Li, and David Jacobs.","author":"Ge Songwei","year":"2022","unstructured":"Songwei Ge, Shlok Kumar Mishra, Simon Kornblith, Chun-Liang Li, and David Jacobs. 2022. Hyperbolic Contrastive Learning for Visual Representations beyond Objects. ArXiv, Vol. abs\/2212.00653 (2022)."},{"key":"e_1_3_2_1_42_1","volume-title":"Properties of Euclidean and non-Euclidean distance matrices. Linear algebra and its applications","author":"Gower John Clifford","year":"1985","unstructured":"John Clifford Gower. 1985. Properties of Euclidean and non-Euclidean distance matrices. Linear algebra and its applications, Vol. 67 (1985), 81-97."},{"key":"e_1_3_2_1_43_1","unstructured":"Aaron Grattafiori Abhimanyu Dubey Abhinav Jauhri Abhinav Pandey Abhishek Kadian Ahmad Al-Dahle Aiesha Letman Akhil Mathur Alan Schelten Amy Yang Angela Fan et al. 2024. The Llama 3 Herd of Models. arXiv:2407.21783(2024)."},{"key":"e_1_3_2_1_44_1","volume-title":"Peter Battaglia, Victor Bapst, David Raposo, Adam Santoro, et al.","author":"Gulcehre Caglar","year":"2019","unstructured":"Caglar Gulcehre, Misha Denil, Mateusz Malinowski, Ali Razavi, Razvan Pascanu, Karl Moritz Hermann, Peter Battaglia, Victor Bapst, David Raposo, Adam Santoro, et al., 2019. Hyperbolic attention networks. In ICLR."},{"key":"e_1_3_2_1_45_1","doi-asserted-by":"crossref","unstructured":"Zihao Guo Qingyun Sun Haonan Yuan Xingcheng Fu Min Zhou Yisen Gao and Jianxin Li. 2025. GraphMoRE: Mitigating Topological Heterogeneity via Mixture of Riemannian Experts. In AAAI.","DOI":"10.1609\/aaai.v39i11.33279"},{"key":"e_1_3_2_1_46_1","unstructured":"Kaiming He Haoqi Fan Yuxin Wu Saining Xie and Ross Girshick. 2020. Momentum contrast for unsupervised visual representation learning. In CVPR."},{"key":"e_1_3_2_1_47_1","volume-title":"HELM: Hyperbolic Large Language Models via Mixture-of-Curvature Experts. arXiv preprint arXiv:2505.24722(2025).","author":"He Neil","year":"2025","unstructured":"Neil He, Rishabh Anand, Hiren Madhu, Ali Maatouk, Smita Krishnaswamy, Leandros Tassiulas, Menglin Yang, and Rex Ying. 2025a. HELM: Hyperbolic Large Language Models via Mixture-of-Curvature Experts. arXiv preprint arXiv:2505.24722(2025)."},{"key":"e_1_3_2_1_48_1","volume-title":"Position: Beyond Euclidean-Foundation Models Should Embrace Non-Euclidean Geometries. arXiv:2504.08896(2025).","author":"He Neil","year":"2025","unstructured":"Neil He, Jiahong Liu, Buze Zhang, Ngoc Bui, Ali Maatouk, Menglin Yang, Irwin King, Melanie Weber, and Rex Ying. 2025b. Position: Beyond Euclidean-Foundation Models Should Embrace Non-Euclidean Geometries. arXiv:2504.08896(2025)."},{"key":"e_1_3_2_1_49_1","unstructured":"Neil He Menglin Yang and Rex Ying. 2025c. HyperCore: The Core Framework for Building Hyperbolic Foundation Models with Comprehensive Modules. arXiv:2504.08912(2025)."},{"key":"e_1_3_2_1_50_1","unstructured":"Neil He Menglin Yang and Rex Ying. 2025d. Lorentzian Residual Neural Networks. In KDD."},{"key":"e_1_3_2_1_51_1","unstructured":"Jonathan Ho Ajay Jain and Pieter Abbeel. 2020. Denoising Diffusion Probabilistic Models. In NeurIPS."},{"key":"e_1_3_2_1_52_1","unstructured":"Jordan Hoffmann and et al. 2022. Training compute-optimal large language models. In NeurIPS. Red Hook NY USA Article 2176 15 pages."},{"key":"e_1_3_2_1_53_1","unstructured":"Edward J Hu Yelong Shen Phillip Wallis Zeyuan Allen-Zhu Yuanzhi Li Shean Wang Lu Wang and Weizhu Chen. 2022. LoRA: Low-Rank Adaptation of Large Language Models. In ICLR."},{"key":"e_1_3_2_1_54_1","volume-title":"Prakash Panangaden, and Aaron Courville.","author":"Huang Chin-Wei","year":"2022","unstructured":"Chin-Wei Huang, Milad Aghajohari, Avishek Joey Bose, Prakash Panangaden, and Aaron Courville. 2022. Riemannian Diffusion Models. In NeurIPS."},{"key":"e_1_3_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/3703155"},{"key":"e_1_3_2_1_56_1","first-page":"448","article-title":"Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift","author":"Ioffe Sergey","year":"2015","unstructured":"Sergey Ioffe and Christian Szegedy. 2015. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. In ICML. 448-456.","journal-title":"ICML."},{"key":"e_1_3_2_1_57_1","unstructured":"Jaehyeong Jo and Hwang Sung Ju Lee Seul and. 2022. Score-based Generative Modeling of Graphs via the System of Stochastic Differential Equations. In ICML."},{"key":"e_1_3_2_1_58_1","unstructured":"Sayash Kapoor Rishi Bommasani Kevin Klyman Shayne Longpre Ashwin Ramaswami Peter Cihon Aspen Hopkins Kevin Bankston Stella Biderman Miranda Bogen Rumman Chowdhury Alex Engler Peter Henderson Yacine Jernite Seth Lazar Stefano Maffulli Alondra Nelson Joelle Pineau Aviya Skowron Dawn Song Victor Storchan Daniel Zhang Daniel E. Ho Percy Liang and Arvind Narayanan. 2023. On the Societal Impact of Open Foundation Models. arXiv:2304.11082(2023)."},{"key":"e_1_3_2_1_59_1","unstructured":"W Sean Kennedy Onuttom Narayan and Iraj Saniee. 2013. On the hyperbolicity of large-scale networks. arXiv:1307.0031(2013)."},{"key":"e_1_3_2_1_60_1","doi-asserted-by":"crossref","unstructured":"Bobak Kiani Jason Wang and Melanie Weber. 2024. Hardness of Learning Neural Networks under the Manifold Hypothesis. In NeurIPS.","DOI":"10.52202\/079017-0184"},{"key":"e_1_3_2_1_61_1","unstructured":"Minseon Kim Jihoon Tack and Sung Ju Hwang. 2020. Adversarial Self-Supervised Contrastive Learning. In NeurIPS."},{"key":"e_1_3_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1561\/2200000056"},{"key":"e_1_3_2_1_63_1","volume-title":"Geoopt: Riemannian Optimization in PyTorch. arXiv:2005.02819(2020).","author":"Kochurov Max","year":"2020","unstructured":"Max Kochurov, Rasul Karimov, and Sergei Kozlukov. 2020. Geoopt: Riemannian Optimization in PyTorch. arXiv:2005.02819(2020)."},{"key":"e_1_3_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevE.82.036106"},{"key":"e_1_3_2_1_65_1","first-page":"3672","article-title":"Lorentzian distance learning for hyperbolic representations","author":"Law Marc","year":"2019","unstructured":"Marc Law, Renjie Liao, Jake Snell, and Richard Zemel. 2019. Lorentzian distance learning for hyperbolic representations. In ICML. PMLR, 3672-3681.","journal-title":"ICML. PMLR"},{"key":"e_1_3_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2021.3094723"},{"key":"e_1_3_2_1_67_1","first-page":"3231","article-title":"Inferring Concept Hierarchies from Text Corpora via Hyperbolic Embeddings","author":"Le Matthew","year":"2019","unstructured":"Matthew Le, Stephen Roller, Laetitia Papaxanthos, Douwe Kiela, and Maximilian Nickel. 2019. Inferring Concept Hierarchies from Text Corpora via Hyperbolic Embeddings. In ACL. 3231-3241.","journal-title":"ACL."},{"key":"e_1_3_2_1_68_1","unstructured":"Holden Lee Jianfeng Lu and Yixin Tan. 2022. Convergence for score-based generative modeling with polynomial complexity. In NeurIPS."},{"key":"e_1_3_2_1_69_1","first-page":"9459","author":"Lewis Patrick","year":"2020","unstructured":"Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich K\u00fcttler, Mike Lewis, Wen-tau Yih, Tim Rockt\u00e4schel, et al., 2020. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. In NeurIPS. 9459-9774.","journal-title":"Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. In NeurIPS."},{"key":"e_1_3_2_1_70_1","unstructured":"Junnan Li Dongxu Li Silvio Savarese and Steven Hoi. 2023a. BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models. In ICML."},{"key":"e_1_3_2_1_71_1","doi-asserted-by":"crossref","unstructured":"Lingxiao Li Yi Zhang and Shuhui Wang. 2023b. The Euclidean Space is Evil: Hyperbolic Attribute Editing for Few-shot Image Generation. In ICCV.","DOI":"10.1109\/ICCV51070.2023.02076"},{"key":"e_1_3_2_1_72_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF01200757"},{"key":"e_1_3_2_1_73_1","unstructured":"Yaron Lipman Ricky T. Q. Chen Heli Ben-Hamu Maximilian Nickel and Matt Le. 2022. Flow Matching for Generative Modeling. arXiv:2210.02747(2022)."},{"key":"e_1_3_2_1_74_1","volume-title":"NeurIPS 2nd SSL workshop.","author":"Liu Jiahong","year":"2022","unstructured":"Jiahong Liu, Menglin Yang, Min Zhou, Shanshan Feng, and Philippe Fournier-Viger. 2022. Enhancing Hyperbolic Graph Embeddings via Contrastive Learning. In NeurIPS 2nd SSL workshop."},{"key":"e_1_3_2_1_75_1","first-page":"8230","article-title":"Hyperbolic graph neural networks","author":"Liu Qi","year":"2019","unstructured":"Qi Liu, Maximilian Nickel, and Douwe Kiela. 2019. Hyperbolic graph neural networks. In NeurIPS. 8230-8241.","journal-title":"NeurIPS."},{"key":"e_1_3_2_1_76_1","unstructured":"Aaron Lou Isay Katsman Qingxuan Jiang Serge Belongie Ser-Nam Lim and Christopher De Sa. 2020a. Differentiating through the Frechet Mean. arXiv:2003.00335(2020)."},{"key":"e_1_3_2_1_77_1","first-page":"6393","article-title":"Differentiating through the Fr\u00e9chet Mean","author":"Lou Aaron","year":"2020","unstructured":"Aaron Lou, Isay Katsman, Qingxuan Jiang, Serge Belongie, Ser-Nam Lim, and Christopher De Sa. 2020b. Differentiating through the Fr\u00e9chet Mean. In ICML. 6393-6403.","journal-title":"ICML."},{"key":"e_1_3_2_1_78_1","unstructured":"Qiyao Ma Menglin Yang Mingxuan Ju Tong Zhao Neil Shah and Rex Ying. 2024. HARec: Hyperbolic graph-llm alignment for exploration and exploitation in recommender systems. arXiv:2411.13865(2024)."},{"key":"e_1_3_2_1_79_1","doi-asserted-by":"crossref","unstructured":"Paolo Mandica Luca Franco Konstantinos Kallidromitis Suzanne Petryk and Fabio Galasso. 2024. Hyperbolic Learning with Multimodal Large Language Models. arXiv:2408.05097(2024).","DOI":"10.1007\/978-3-031-91585-7_23"},{"key":"e_1_3_2_1_80_1","volume-title":"Chris J. Maddison, Ryota Tomioka, and Yee Whye Teh.","author":"Mathieu Emile","year":"2019","unstructured":"Emile Mathieu, Charline Le Lan, Chris J. Maddison, Ryota Tomioka, and Yee Whye Teh. 2019. Continuous Hierarchical Representations with Poincar\u00e9 Variational Auto-Encoders. In NeurIPS."},{"key":"e_1_3_2_1_81_1","volume-title":"Martin Keller-Ressel, Jeffrey Gu, and Serena Yeung.","author":"Mettes Pascal","year":"2024","unstructured":"Pascal Mettes, Mina Ghadimi Atigh, Martin Keller-Ressel, Jeffrey Gu, and Serena Yeung. 2024. Hyperbolic deep learning in computer vision: A survey. International Journal of Computer Vision(2024), 1-25."},{"key":"e_1_3_2_1_82_1","first-page":"1","volume-title":"JMLR","volume":"21","author":"Miolane Nina","year":"2020","unstructured":"Nina Miolane, Nicolas Guigui, Alice Le Brigant, Johan Mathe, Benjamin Hou, Yann Thanwerdas, Stefan Heyder, Olivier Peltre, Niklas Koep, Hadi Zaatiti, Hatem Hajri, Yann Cabanes, Thomas Gerald, Paul Chauchat, Christian Shewmake, Daniel Brooks, Bernhard Kainz, Claire Donnat, Susan Holmes, and Xavier Pennec. 2020. Geomstats: A Python Package for Riemannian Geometry in Machine Learning. JMLR, Vol. 21, 223 (2020), 1-9."},{"key":"e_1_3_2_1_83_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10586-023-04203-7"},{"key":"e_1_3_2_1_84_1","first-page":"4693","article-title":"A wrapped normal distribution on hyperbolic space for gradient-based learning","author":"Nagano Yoshihiro","year":"2019","unstructured":"Yoshihiro Nagano, Shoichiro Yamaguchi, Yasuhiro Fujita, and Masanori Koyama. 2019. A wrapped normal distribution on hyperbolic space for gradient-based learning. In ICML. PMLR, 4693-4702.","journal-title":"ICML. PMLR"},{"key":"e_1_3_2_1_85_1","first-page":"6338","article-title":"Poincar\u00e9 embeddings for learning hierarchical representations","author":"Nickel Maximillian","year":"2017","unstructured":"Maximillian Nickel and Douwe Kiela. 2017. Poincar\u00e9 embeddings for learning hierarchical representations. In NeurIPS. 6338-6347.","journal-title":"NeurIPS."},{"key":"e_1_3_2_1_86_1","first-page":"3779","article-title":"Learning Continuous Hierarchies in the Lorentz Model of Hyperbolic Geometry","author":"Nickel Maximillian","year":"2018","unstructured":"Maximillian Nickel and Douwe Kiela. 2018. Learning Continuous Hierarchies in the Lorentz Model of Hyperbolic Geometry. In ICML. 3779-3788.","journal-title":"ICML."},{"key":"e_1_3_2_1_87_1","unstructured":"Keiron O'Shea and Ryan Nash. 2015. An Introduction to Convolutional Neural Networks. arXiv:1511.08458(2015)."},{"key":"e_1_3_2_1_88_1","volume-title":"Alessandro Flaborea, Fabio Galasso, and Pascal Mettes.","author":"Pal Avik","year":"2025","unstructured":"Avik Pal, Max van Spengler, Guido Maria D'Amely di Melendugno, Alessandro Flaborea, Fabio Galasso, and Pascal Mettes. 2025. Compositional Entailment Learning for Hyperbolic Vision-Language Models. ICLR(2025)."},{"key":"e_1_3_2_1_89_1","doi-asserted-by":"publisher","DOI":"10.1109\/INFCOM.2010.5462131"},{"key":"e_1_3_2_1_90_1","unstructured":"Wei Peng Tuomas Varanka Abdelrahman Mostafa Henglin Shi and Guoying Zhao. 2021. Hyperbolic deep neural networks: A survey. TPAMI(2021)."},{"key":"e_1_3_2_1_91_1","first-page":"127","article-title":"Intrinsic Statistics on Riemannian Manifolds","volume":"25","author":"Pennec Xavier","year":"2006","unstructured":"Xavier Pennec. 2006. Intrinsic Statistics on Riemannian Manifolds: Basic Tools for Geometric Measurements. JMIV, Vol. 25, 1 (2006), 127-154.","journal-title":"Basic Tools for Geometric Measurements. JMIV"},{"key":"e_1_3_2_1_92_1","doi-asserted-by":"publisher","DOI":"10.1038\/s41598-023-27995-5"},{"key":"e_1_3_2_1_93_1","unstructured":"Eric Qu and Dongmian Zou. 2022a. Hyperbolic Neural Networks for Molecular Generation. arXiv:2201.12825(2022)."},{"key":"e_1_3_2_1_94_1","unstructured":"Eric Qu and Dongmian Zou. 2022b. Lorentzian fully hyperbolic generative adversarial network. arXiv:2201.12825(2022)."},{"key":"e_1_3_2_1_95_1","volume-title":"Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever.","author":"Radford","year":"2021","unstructured":"Radford, Kim Jong Wook Alec, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. 2021. Learning Transferable Visual Models From Natural Language Supervision. arXiv:2103.00020(2021)."},{"key":"e_1_3_2_1_96_1","first-page":"3505","article-title":"DeepSpeed: System Optimizations Enable Training Deep Learning Models with Over 100 Billion Parameters","author":"Rasley Jeff","year":"2020","unstructured":"Jeff Rasley, Samyam Rajbhandari, Olatunji Ruwase, and Yuxiong He. 2020. DeepSpeed: System Optimizations Enable Training Deep Learning Models with Over 100 Billion Parameters. In KDD. 3505-3506.","journal-title":"KDD."},{"key":"e_1_3_2_1_97_1","doi-asserted-by":"publisher","DOI":"10.3390\/e16074015"},{"key":"e_1_3_2_1_98_1","first-page":"4460","article-title":"Representation Tradeoffs for Hyperbolic Embeddings","author":"Sala Frederic","year":"2018","unstructured":"Frederic Sala, Chris De Sa, Albert Gu, and Christopher Re. 2018. Representation Tradeoffs for Hyperbolic Embeddings. In ICML. 4460-4469.","journal-title":"ICML."},{"key":"e_1_3_2_1_99_1","volume-title":"Beyond Euclid: An Illustrated Guide to Modern Machine Learning with Geometric, Topological, and Algebraic Structures. arXiv:2407.09468(2024).","author":"Sanborn Sophia","year":"2024","unstructured":"Sophia Sanborn, Johan Mathe, Mathilde Papillon, Domas Buracas, Hansen J. Lillemark, Christian Shewmake, Abby Bertics, Xavier Pennec, and Nina Miolane. 2024. Beyond Euclid: An Illustrated Guide to Modern Machine Learning with Geometric, Topological, and Algebraic Structures. arXiv:2407.09468(2024)."},{"key":"e_1_3_2_1_100_1","volume-title":"International Symposium on Graph Drawing. Springer, 355-366","author":"Sarkar Rik","year":"2011","unstructured":"Rik Sarkar. 2011. Low distortion delaunay embedding of trees in hyperbolic plane. In International Symposium on Graph Drawing. Springer, 355-366."},{"key":"e_1_3_2_1_101_1","unstructured":"Ryohei Shimizu Yusuke Mukuta and Tatsuya Harada. 2020. Hyperbolic Neural Networks. In ICLR."},{"key":"e_1_3_2_1_102_1","unstructured":"Yang Song Jascha Sohl-Dickstein Diederik P Kingma Abhishek Kumar Stefano Ermon and Ben Poole. 2021. Score-Based Generative Modeling through Stochastic Differential Equations. In ICLR."},{"key":"e_1_3_2_1_103_1","unstructured":"Jianlin Su Yu Lu Shengfeng Pan Ahmed Murtadha Bo Wen and Yunfeng Liu. 2021. RoFormer: Enhanced Transformer with Rotary Position Embedding. arXiv:2104.09864(2021)."},{"key":"e_1_3_2_1_104_1","unstructured":"Xunzhu Tang Saad Ezzini Haoye Tian Yewei Song Jacques Klein Tegawende F Bissyande et al. 2023. Hyperbolic code retrieval: a novel approach for efficient code search using hyperbolic space embeddings. arXiv:2308.15234(2023)."},{"key":"e_1_3_2_1_105_1","unstructured":"Alexandru Tifrea Gary B\u00e9cigneul and Octavian-Eugen Ganea. 2018. Poincar\u00e9 glove: Hyperbolic word embeddings. arXiv:1810.06546(2018)."},{"key":"e_1_3_2_1_106_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-02396-5"},{"key":"e_1_3_2_1_107_1","doi-asserted-by":"crossref","unstructured":"Max van Spengler Erwin Berkhout and Pascal Mettes. 2023. Poincar\u00e9 ResNet. CVPR(2023).","DOI":"10.1109\/ICCV51070.2023.00499"},{"key":"e_1_3_2_1_108_1","first-page":"5998","article-title":"Attention is all you need","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In NeurIPS. 5998-6008.","journal-title":"NeurIPS."},{"key":"e_1_3_2_1_109_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btaa701"},{"key":"e_1_3_2_1_110_1","doi-asserted-by":"crossref","unstructured":"Xinlong Wang Rufeng Zhang Chunhua Shen Tao Kong and Lei Li. 2021. Dense contrastive learning for self-supervised visual pre-training. In CVPR.","DOI":"10.1109\/CVPR46437.2021.00304"},{"key":"e_1_3_2_1_111_1","unstructured":"Lingfeng Wen Xuan Tang Mingjie Ouyang Xiangxiang Shen Jian Yang Daxin Zhu Mingsong Chen and Xian Wei. 2024. Hyperbolic Graph Diffusion Model. In AAAI."},{"key":"e_1_3_2_1_112_1","volume-title":"Yew Soon Ong, and Chen Change Loy","author":"Xie Jiahao","year":"2021","unstructured":"Jiahao Xie, Xiaohang Zhan, Ziwei Liu, Yew Soon Ong, and Chen Change Loy. 2021. Unsupervised object-level representation learning from scene images. In NeurIPS."},{"key":"e_1_3_2_1_113_1","volume-title":"Hyperbolic Fine-tuning for Large Language Models. ICML LLM Cognition Workshop(2024)","author":"Yang Menglin","year":"2024","unstructured":"Menglin Yang, Aosong Feng, Bo Xiong, Jihong Liu, Irwin King, and Rex Ying. 2024a. Hyperbolic Fine-tuning for Large Language Models. ICML LLM Cognition Workshop(2024)."},{"key":"e_1_3_2_1_114_1","first-page":"2212","article-title":"Hicf: Hyperbolic informative collaborative filtering","author":"Yang Menglin","year":"2022","unstructured":"Menglin Yang, Zhihao Li, Min Zhou, Jiahong Liu, and Irwin King. 2022a. Hicf: Hyperbolic informative collaborative filtering. In KDD. 2212-2221.","journal-title":"KDD."},{"key":"e_1_3_2_1_115_1","first-page":"3770","article-title":"Hypformer: Exploring efficient transformer fully in hyperbolic space","author":"Yang Menglin","year":"2024","unstructured":"Menglin Yang, Harshit Verma, Delvin Ce Zhang, Jiahong Liu, Irwin King, and Rex Ying. 2024b. Hypformer: Exploring efficient transformer fully in hyperbolic space. In KDD. 3770-3781.","journal-title":"KDD."},{"key":"e_1_3_2_1_116_1","first-page":"1975","article-title":"Discrete-time Temporal Network Embedding via Implicit Hierarchical Learning in Hyperbolic Space","author":"Yang Menglin","year":"2021","unstructured":"Menglin Yang, Min Zhou, Marcus Kalander, Zengfeng Huang, and Irwin King. 2021. Discrete-time Temporal Network Embedding via Implicit Hierarchical Learning in Hyperbolic Space. In KDD. 1975-1985.","journal-title":"KDD."},{"key":"e_1_3_2_1_117_1","unstructured":"Menglin Yang Min Zhou Zhihao Li Jiahong Liu Lujia Pan Hui Xiong and Irwin King. 2022b. Hyperbolic graph neural networks: a review of methods and applications. arXiv:2202.13852(2022)."},{"key":"e_1_3_2_1_118_1","volume-title":"HRCF: Enhancing Collaborative Filtering via Hyperbolic Geometric Regularization. In WebConf.","author":"Yang Menglin","year":"2022","unstructured":"Menglin Yang, Min Zhou, Jiahong Liu, Defu Lian, and Irwin King. 2022c. HRCF: Enhancing Collaborative Filtering via Hyperbolic Geometric Regularization. In WebConf."},{"key":"e_1_3_2_1_119_1","unstructured":"Menglin Yang Min Zhou Hui Xiong and Irwin King. 2022 d. Hyperbolic Temporal Network Embedding. TKDE(2022)."},{"key":"e_1_3_2_1_120_1","unstructured":"Yun Yue Fangzhou Lin Kazunori D. Yamada and Ziming Zhang. 2023. Hyperbolic Contrastive Learning. arXiv:2302.01409(2023)."},{"key":"e_1_3_2_1_121_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2024.3369699"},{"key":"e_1_3_2_1_122_1","doi-asserted-by":"crossref","unstructured":"Yiding Zhang Xiao Wang Chuan Shi Xunqiang Jiang and Yanfang Fanny Ye. 2021a. Hyperbolic graph attention network. TBD(2021).","DOI":"10.1109\/TBDATA.2021.3081431"},{"key":"e_1_3_2_1_123_1","first-page":"1249","article-title":"Lorentzian Graph Convolutional Networks","author":"Zhang Yiding","year":"2021","unstructured":"Yiding Zhang, Xiao Wang, Chuan Shi, Nian Liu, and Guojie Song. 2021b. Lorentzian Graph Convolutional Networks. In WebConf. 1249-1261.","journal-title":"WebConf."},{"key":"e_1_3_2_1_124_1","first-page":"1984","article-title":"Understanding and mitigating hyperbolic dimensional collapse in graph contrastive learning","author":"Zhang Yifei","year":"2025","unstructured":"Yifei Zhang, Hao Zhu, Menglin Yang, Jiahong Liu, Rex Ying, Irwin King, and Piotr Koniusz. 2025. Understanding and mitigating hyperbolic dimensional collapse in graph contrastive learning. In KDD. 1984-1995.","journal-title":"KDD."},{"key":"e_1_3_2_1_125_1","unstructured":"Wayne Xin Zhao Kun Zhou Junyi Li Tianyi Tang Xiaolei Wang Yupeng Hou Yingqian Min Beichen Zhang Junjie Zhang Zican Dong et al. 2023. A survey of large language models. arXiv:2303.18223 Vol. 1 2 (2023)."},{"key":"e_1_3_2_1_126_1","first-page":"7548","article-title":"Graph Geometry Interaction Learning","volume":"33","author":"Zhu Shichao","year":"2020","unstructured":"Shichao Zhu, Shirui Pan, Chuan Zhou, Jia Wu, Yanan Cao, and Bin Wang. 2020. Graph Geometry Interaction Learning. In NeurIPS, Vol. 33. 7548-7558.","journal-title":"NeurIPS"}],"event":{"name":"KDD '25: The 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining","location":"Toronto ON Canada","acronym":"KDD '25","sponsor":["SIGKDD ACM Special Interest Group on Knowledge Discovery in Data","SIGMOD ACM Special Interest Group on Management of Data"]},"container-title":["Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3711896.3736564","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,30]],"date-time":"2026-04-30T17:58:48Z","timestamp":1777571928000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3711896.3736564"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,8,3]]},"references-count":126,"alternative-id":["10.1145\/3711896.3736564","10.1145\/3711896"],"URL":"https:\/\/doi.org\/10.1145\/3711896.3736564","relation":{},"subject":[],"published":{"date-parts":[[2025,8,3]]},"assertion":[{"value":"2025-08-03","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}