{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,20]],"date-time":"2025-09-20T21:06:44Z","timestamp":1758402404571,"version":"3.44.0"},"reference-count":62,"publisher":"Association for Computing Machinery (ACM)","issue":"12","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2025,8]]},"abstract":"<jats:p>\n            Approximate nearest neighbor search (ANNS) is a fundamental problem in vector databases and AI infrastructures. Recent graph-based ANNS algorithms have achieved high search accuracy with practical efficiency. Despite the advancements, these algorithms still face performance bottlenecks in production, due to the random memory access patterns of graph-based search and the high computational overheads of vector distance. In addition, the performance of a graph-based ANNS algorithm is highly sensitive to parameters, while selecting the optimal parameters is cost-prohibitive, e.g., manual tuning requires repeatedly re-building the index. 
This paper introduces\n            <jats:italic toggle=\"yes\">VSAG<\/jats:italic>\n            , an open-source framework that aims to enhance the in-production performance of graph-based ANNS algorithms.\n            <jats:italic toggle=\"yes\">VSAG<\/jats:italic>\n            has been deployed at scale in the services of Ant Group, and it incorporates three key optimizations: (\n            <jats:italic toggle=\"yes\">i) efficient memory access<\/jats:italic>\n            : it reduces L3 cache misses with pre-fetching and cache-friendly vector organization; (\n            <jats:italic toggle=\"yes\">ii) automated parameter tuning<\/jats:italic>\n            : it automatically selects performance-optimal parameters without requiring index rebuilding; (\n            <jats:italic toggle=\"yes\">iii) efficient distance computation<\/jats:italic>\n            : it leverages modern hardware, scalar quantization, and smartly switches to low-precision representation to dramatically reduce the distance computation costs. We evaluate\n            <jats:italic toggle=\"yes\">VSAG<\/jats:italic>\n            on real-world datasets. 
The experimental results show that\n            <jats:italic toggle=\"yes\">VSAG<\/jats:italic>\n            achieves the state-of-the-art performance and provides up to 4\u00d7 speedup over HNSWlib (an industry-standard library) while ensuring the same accuracy.\n          <\/jats:p>","DOI":"10.14778\/3750601.3750624","type":"journal-article","created":{"date-parts":[[2025,9,16]],"date-time":"2025-09-16T13:38:05Z","timestamp":1758029885000},"page":"5017-5030","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["VSAG: An Optimized Search Framework for Graph-Based Approximate Nearest Neighbor Search"],"prefix":"10.14778","volume":"18","author":[{"given":"Xiaoyao","family":"Zhong","sequence":"first","affiliation":[{"name":"Ant Group, Shanghai, China"}]},{"given":"Haotian","family":"Li","sequence":"additional","affiliation":[{"name":"Ant Group, Shanghai, China"}]},{"given":"Jiabao","family":"Jin","sequence":"additional","affiliation":[{"name":"Ant Group, Shanghai, China"}]},{"given":"Mingyu","family":"Yang","sequence":"additional","affiliation":[{"name":"Ant Group, Shanghai, China"}]},{"given":"Deming","family":"Chu","sequence":"additional","affiliation":[{"name":"Ant Group, Shanghai, China"}]},{"given":"Xiangyu","family":"Wang","sequence":"additional","affiliation":[{"name":"Ant Group, Shanghai, China"}]},{"given":"Zhitao","family":"Shen","sequence":"additional","affiliation":[{"name":"Ant Group, Shanghai, China"}]},{"given":"Wei","family":"Jia","sequence":"additional","affiliation":[{"name":"Ant Group, Shanghai, China"}]},{"given":"George","family":"Gu","sequence":"additional","affiliation":[{"name":"Intel Corporation, Shanghai, China"}]},{"given":"Yi","family":"Xie","sequence":"additional","affiliation":[{"name":"Intel Corporation, Shanghai, China"}]},{"given":"Xuemin","family":"Lin","sequence":"additional","affiliation":[{"name":"Shanghai Jiaotong University, Shanghai, China"}]},{"given":"Heng 
Tao","family":"Shen","sequence":"additional","affiliation":[{"name":"Tongji University, Shanghai, China"}]},{"given":"Jingkuan","family":"Song","sequence":"additional","affiliation":[{"name":"Tongji University, Shanghai, China"}]},{"given":"Peng","family":"Cheng","sequence":"additional","affiliation":[{"name":"Tongji University and East China Normal University, Shanghai, China"}]}],"member":"320","published-online":{"date-parts":[[2025,9,16]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"Alipay. 2025. Ant Group. https:\/\/www.antgroup.com."},{"key":"e_1_2_1_2_1","unstructured":"Alipay. 2025. Face Recognition. https:\/\/open.alipay.com\/api\/detail?code=I1080300001000043632."},{"key":"e_1_2_1_3_1","unstructured":"Amazon. 2025. Product Search. https:\/\/www.amazon.com\/."},{"key":"e_1_2_1_4_1","first-page":"4","volume-title":"Proceedings of the VLDB Endowment 9","author":"Andr\u00e9 Fabien","year":"2015","unstructured":"Fabien Andr\u00e9, Anne-Marie Kermarrec, and Nicolas Le Scouarnec. 2015. Cache locality is not enough: High-Performance Nearest Neighbor Search with Product Quantization Fast Scan. Proceedings of the VLDB Endowment 9, 4 (2015)."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.is.2019.02.006"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/3373376.3378498"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.14778\/3583140.3583166"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2014.2361319"},{"key":"e_1_2_1_9_1","volume-title":"International Workshop on AI-assisted Design for Architecture (AIDArc), held in conjunction with ISCA.","author":"Braun Peter","year":"2019","unstructured":"Peter Braun and Heiner Litz. 2019. Understanding memory access patterns for prefetching. 
In International Workshop on AI-assisted Design for Architecture (AIDArc), held in conjunction with ISCA."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/2939672.2939785"},{"key":"e_1_2_1_11_1","volume-title":"Effective hardware-based data prefetching for high-performance processors","author":"Chen Tien-Fu","year":"1995","unstructured":"Tien-Fu Chen and Jean-Loup Baer. 1995. Effective hardware-based data prefetching for high-performance processors. IEEE transactions on computers 44, 5 (1995), 609\u2013623."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/997817.997857"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/1963405.1963487"},{"key":"e_1_2_1_14_1","volume-title":"Learning Space Partitions for Nearest Neighbor Search. ICLR","author":"Dong Yihe","year":"2020","unstructured":"Yihe Dong, Piotr Indyk, Ilya P Razenshteyn, and Tal Wagner. 2020. Learning Space Partitions for Nearest Neighbor Search. ICLR (2020)."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2021.3067706"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.14778\/3303753.3303754"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.14778\/3303753.3303754"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/2213836.2213898"},{"key":"e_1_2_1_19_1","volume-title":"Practical and Asymptotically Optimal Quantization of High-Dimensional Vectors in Euclidean Space for Approximate Nearest Neighbor Search. arXiv preprint arXiv:2409.09913","author":"Gao Jianyang","year":"2024","unstructured":"Jianyang Gao, Yutong Gou, Yuexuan Xu, Yongyi Yang, Cheng Long, and Raymond Chi-Wing Wong. 2024. Practical and Asymptotically Optimal Quantization of High-Dimensional Vectors in Euclidean Space for Approximate Nearest Neighbor Search. 
arXiv preprint arXiv:2409.09913 (2024)."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3589282"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3654970"},{"key":"e_1_2_1_22_1","unstructured":"Yunfan Gao Yun Xiong Xinyu Gao Kangxiang Jia Jinliu Pan Yuxi Bi Yi Dai Jiawei Sun Meng Wang and Haofen Wang. 2024. Retrieval-Augmented Generation for Large Language Models: A Survey. arXiv:2312.10997 [cs.CL]"},{"key":"e_1_2_1_23_1","first-page":"518","article-title":"Similarity search in high dimensions via hashing","volume":"99","author":"Gionis Aristides","year":"1999","unstructured":"Aristides Gionis, Piotr Indyk, Rajeev Motwani, et al. 1999. Similarity search in high dimensions via hashing. In Vldb, Vol. 99. 518\u2013529.","journal-title":"Vldb"},{"key":"e_1_2_1_24_1","unstructured":"Google. 2025. Search Engine. https:\/\/www.google.com\/."},{"key":"e_1_2_1_25_1","first-page":"12","article-title":"Manu: a cloud native vector database management system","volume":"15","author":"Guo Rentong","year":"2022","unstructured":"Rentong Guo, Xiaofan Luan, Long Xiang, Xiao Yan, Xiaomeng Yi, Jigao Luo, Qianya Cheng, Weizhi Xu, Jiarui Luo, Frank Liu, Zhenshan Cao, Yanliang Qiao, Ting Wang, Bo Tang, and Charles Xie. 2022. Manu: a cloud native vector database management system. Proc. VLDB Endow. 15, 12 (Aug. 2022), 3548\u20133561.","journal-title":"Proc. VLDB Endow."},{"key":"e_1_2_1_26_1","volume-title":"International Conference on Machine Learning. PMLR, 3887\u20133896","author":"Guo Ruiqi","year":"2020","unstructured":"Ruiqi Guo, Philip Sun, Erik Lindgren, Quan Geng, David Simcha, Felix Chern, and Sanjiv Kumar. 2020. Accelerating large-scale inference with anisotropic vector quantization. In International Conference on Machine Learning. PMLR, 3887\u20133896."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/276698.276876"},{"key":"e_1_2_1_28_1","unstructured":"Intel. 2025. AVX512. 
https:\/\/www.intel.com\/content\/www\/us\/en\/products\/docs\/accelerator-engines\/what-is-intel-avx-512.html."},{"key":"e_1_2_1_29_1","volume-title":"Ravishankar Krishnawamy, and Rohan Kadekodi.","author":"Subramanya Suhas Jayaram","year":"2019","unstructured":"Suhas Jayaram Subramanya, Fnu Devvrit, Harsha Vardhan Simhadri, Ravishankar Krishnawamy, and Rohan Kadekodi. 2019. Diskann: Fast accurate billion-point nearest neighbor search on a single node. Advances in Neural Information Processing Systems 32 (2019)."},{"key":"e_1_2_1_30_1","volume-title":"Ravishankar Krishnawamy, and Rohan Kadekodi.","author":"Subramanya Suhas Jayaram","year":"2019","unstructured":"Suhas Jayaram Subramanya, Fnu Devvrit, Harsha Vardhan Simhadri, Ravishankar Krishnawamy, and Rohan Kadekodi. 2019. Diskann: Fast accurate billion-point nearest neighbor search on a single node. Advances in neural information processing Systems 32 (2019)."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2010.57"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2010.57"},{"key":"e_1_2_1_33_1","unstructured":"Kaixuan Ji Guanlin Liu Ning Dai Qingping Yang Renjie Zheng Zheng Wu Chen Dun Quanquan Gu and Lin Yan. 2025. Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization. arXiv:2410.09302 [cs.LG]"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/TBDATA.2019.2921572"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/TBDATA.2019.2921572"},{"key":"e_1_2_1_36_1","first-page":"9459","article-title":"Retrieval-augmented generation for knowledge-intensive nlp tasks","volume":"33","author":"Lewis Patrick","year":"2020","unstructured":"Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich K\u00fcttler, Mike Lewis, Wen-tau Yih, Tim Rockt\u00e4schel, et al. 2020. Retrieval-augmented generation for knowledge-intensive nlp tasks. 
Advances in Neural Information Processing Systems 33 (2020), 9459\u20139474.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2019.2909204"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.14778\/3489496.3489506"},{"key":"e_1_2_1_39_1","unstructured":"Yu A Malkov and Dmitry A Yashunin. [n.d.]. hnswlib. https:\/\/github.com\/nmslib\/hnswlib."},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2018.2889473"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2018.2889473"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/3588908"},{"key":"e_1_2_1_43_1","volume-title":"Proc. ACM Manag. Data 1, 1, Article 54 (May","author":"Peng Yun","year":"2023","unstructured":"Yun Peng, Byron Choi, Tsz Nam Chan, Jianye Yang, and Jianliang Xu. 2023. Efficient Approximate Nearest Neighbor Search in Multi-dimensional Databases. Proc. ACM Manag. Data 1, 1, Article 54 (May 2023), 27 pages."},{"key":"e_1_2_1_44_1","volume-title":"Locality-sensitive binary codes from shift-invariant kernels. Advances in neural information processing systems 22","author":"Raginsky Maxim","year":"2009","unstructured":"Maxim Raginsky and Svetlana Lazebnik. 2009. Locality-sensitive binary codes from shift-invariant kernels. Advances in neural information processing systems 22 (2009)."},{"key":"e_1_2_1_45_1","volume-title":"CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation. arXiv:2502.21074 [cs.CL]","author":"Shen Zhenyi","year":"2025","unstructured":"Zhenyi Shen, Hanqi Yan, Linhai Zhang, Zhanghao Hu, Yali Du, and Yulan He. 2025. CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation. arXiv:2502.21074 [cs.CL]"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.14778\/2735461.2735462"},{"key":"e_1_2_1_47_1","unstructured":"Taobao. 2025. Product Search. 
https:\/\/www.taobao.com\/."},{"key":"e_1_2_1_48_1","unstructured":"Vald. 2021. Vald. https:\/\/github.com\/vdaas\/vald."},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/358923.358939"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/305138.305188"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/3448016.3457550"},{"key":"e_1_2_1_52_1","volume-title":"Heng Tao Shen, et al","author":"Wang Jingdong","year":"2017","unstructured":"Jingdong Wang, Ting Zhang, Nicu Sebe, Heng Tao Shen, et al. 2017. A survey on learning to hash. IEEE transactions on pattern analysis and machine intelligence 40, 4 (2017), 769\u2013790."},{"key":"e_1_2_1_53_1","volume-title":"Effective and General Distance Computation for Approximate Nearest Neighbor Search. arXiv preprint arXiv:2404.16322","author":"Yang Mingyu","year":"2024","unstructured":"Mingyu Yang, Wentao Li, Jiabao Jin, Xiaoyao Zhong, Xiangyu Wang, Zhitao Shen, Wei Jia, and Wei Wang. 2024. Effective and General Distance Computation for Approximate Nearest Neighbor Search. arXiv preprint arXiv:2404.16322 (2024)."},{"key":"e_1_2_1_54_1","volume-title":"Fast High-dimensional Approximate Nearest Neighbor Search with Efficient Index Time and Space. arXiv preprint arXiv:2411.06158","author":"Yang Mingyu","year":"2024","unstructured":"Mingyu Yang, Wentao Li, and Wei Wang. 2024. Fast High-dimensional Approximate Nearest Neighbor Search with Efficient Index Time and Space. arXiv preprint arXiv:2411.06158 (2024)."},{"key":"e_1_2_1_55_1","volume-title":"Xiyue Gao, Qianru Wang, Yanguo Peng, and Jiangtao Cui.","author":"Yang Shuo","year":"2025","unstructured":"Shuo Yang, Jiadong Xie, Yingfan Liu, Jeffrey Xu Yu, Xiyue Gao, Qianru Wang, Yanguo Peng, and Jiangtao Cui. 2025. Revisiting the Index Construction of Proximity Graph-Based Approximate Nearest Neighbor Search. 
arXiv:2410.01231 [cs.DB]"},{"key":"e_1_2_1_56_1","volume-title":"VDTuner: Automated Performance Tuning for Vector Data Management Systems. In 2024 IEEE 40th International Conference on Data Engineering (ICDE). 4357\u20134369","author":"Yang Tiannuo","year":"2024","unstructured":"Tiannuo Yang, Wen Hu, Wangqi Peng, Yusen Li, Jianguo Li, Gang Wang, and Xiaoguang Liu. 2024. VDTuner: Automated Performance Tuning for Vector Data Management Systems. In 2024 IEEE 40th International Conference on Data Engineering (ICDE). 4357\u20134369."},{"key":"e_1_2_1_57_1","unstructured":"YouTube. 2025. Video Search. https:\/\/www.youtube.com\/."},{"key":"e_1_2_1_58_1","unstructured":"Kongcheng Zhang Qi Yao Baisheng Lai Jiaxing Huang Wenkai Fang Dacheng Tao Mingli Song and Shunyu Liu. 2025. Reasoning with Reinforced Functional Token Tuning. arXiv:2502.13389 [cs.AI]"},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.14778\/3377369.3377374"},{"key":"e_1_2_1_60_1","volume-title":"Jingkuan Song, and Peng Cheng.","author":"Zhong Xiaoyao","year":"2025","unstructured":"Xiaoyao Zhong, Haotian Li, Jiabao Jin, Mingyu Yang, Deming Chu, Xiangyu Wang, Zhitao Shen, Wei Jia, George Gu, Yi Xie, Xuemin Lin, Heng Tao Shen, Jingkuan Song, and Peng Cheng. 2025. VSAG: An Optimized Search Framework for Graph-based Approximate Nearest Neighbor Search [technical report]. (2025). 
arXiv:2503.17911 [cs.DB]"},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1145\/2393347.2393377"},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1109\/4235.797969"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3750601.3750624","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,16]],"date-time":"2025-09-16T13:41:54Z","timestamp":1758030114000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3750601.3750624"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,8]]},"references-count":62,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2025,8]]}},"alternative-id":["10.14778\/3750601.3750624"],"URL":"https:\/\/doi.org\/10.14778\/3750601.3750624","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2025,8]]},"assertion":[{"value":"2025-09-16","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}