{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,19]],"date-time":"2026-05-19T07:12:56Z","timestamp":1779174776496,"version":"3.51.4"},"reference-count":64,"publisher":"Association for Computing Machinery (ACM)","issue":"9","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2025,5]]},"abstract":"<jats:p>Modern deep learning models capture the semantics of complex data by transforming them into high-dimensional embedding vectors. Emerging applications, such as retrieval-augmented generation, use approximate nearest neighbor (ANN) search in the embedding vector space to find similar data. Existing vector databases provide indexes for efficient ANN searches, with graph-based indexes being the most popular due to their low latency and high recall in real-world high-dimensional datasets. However, these indexes are costly to build, suffer from significant contention under concurrent read-write workloads, and scale poorly to multiple servers.<\/jats:p>\n          <jats:p>\n            Our goal is to build a vector database that achieves high throughput and high recall under concurrent read-write workloads. To this end, we first propose an ANN index with an explicit two-stage design combining a fast filter stage with highly compressed vectors and a refine stage to ensure recall, and we devise a novel lightweight machine learning technique to fine-tune the index parameters. We introduce an early termination check to dynamically adapt the search process for each query. Next, we add support for writes while maintaining search performance by decoupling the management of the learned parameters. Finally, we design HAKES, a distributed vector database that serves the new index in a disaggregated architecture. 
We evaluate our index and system against 12 state-of-the-art indexes and three distributed vector databases, using high-dimensional embedding datasets generated by deep learning models. The experimental results show that our index outperforms index baselines in the high recall region and under concurrent read-write workloads. Furthermore,\n            <jats:italic toggle=\"yes\">HAKES<\/jats:italic>\n            is scalable and achieves up to 16x higher throughputs than the baselines.\n          <\/jats:p>","DOI":"10.14778\/3746405.3746427","type":"journal-article","created":{"date-parts":[[2025,9,3]],"date-time":"2025-09-03T17:06:20Z","timestamp":1756919180000},"page":"3049-3062","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["<i>HAKES<\/i>\n            : Scalable Vector Database for Embedding Search Service"],"prefix":"10.14778","volume":"18","author":[{"given":"Guoyu","family":"Hu","sequence":"first","affiliation":[{"name":"National University of Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shaofeng","family":"Cai","sequence":"additional","affiliation":[{"name":"National University of Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tien Tuan Anh","family":"Dinh","sequence":"additional","affiliation":[{"name":"Deakin University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhongle","family":"Xie","sequence":"additional","affiliation":[{"name":"Zhejiang University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Cong","family":"Yue","sequence":"additional","affiliation":[{"name":"National University of Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Gang","family":"Chen","sequence":"additional","affiliation":[{"name":"Zhejiang University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Beng 
Chin","family":"Ooi","sequence":"additional","affiliation":[{"name":"Zhejiang University and National University of Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2025,9,3]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.14778\/3611479.3611537"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.14778\/2856318.2856324"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.is.2019.02.006"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.14778\/3583140.3583166"},{"key":"e_1_2_1_5_1","volume-title":"Additive Quantization for Extreme Vector Compression. In 2014 IEEE Conference on Computer Vision and Pattern Recognition. 931\u2013938","author":"Babenko Artem","year":"2014","unstructured":"Artem Babenko and Victor Lempitsky. 2014. Additive Quantization for Extreme Vector Compression. In 2014 IEEE Conference on Computer Vision and Pattern Recognition. 931\u2013938."},{"key":"e_1_2_1_6_1","volume-title":"SPANN: Highly-Efficient Billion-Scale Approximate Nearest Neighborhood Search. In Advances in Neural Information Processing Systems. 5199\u20135212.","author":"Chen Qi","year":"2021","unstructured":"Qi Chen, Bing Zhao, Haidong Wang, Mingqin Li, Chuanjie Liu, Zengzhong Li, Mao Yang, and Jingdong Wang. 2021. SPANN: Highly-Efficient Billion-Scale Approximate Nearest Neighborhood Search. In Advances in Neural Information Processing Systems. 5199\u20135212."},{"key":"e_1_2_1_7_1","volume-title":"Pengi: An Audio Language Model for Audio Tasks. In Advances in Neural Information Processing Systems. 18090\u201318108.","author":"Deshmukh Soham","year":"2023","unstructured":"Soham Deshmukh, Benjamin Elizalde, Rita Singh, and Huaming Wang. 2023. Pengi: An Audio Language Model for Audio Tasks. In Advances in Neural Information Processing Systems. 
18090\u201318108."},{"key":"e_1_2_1_8_1","volume-title":"Learning Space Partitions for Nearest Neighbor Search. In 8th International Conference on Learning Representations.","author":"Dong Yihe","year":"2020","unstructured":"Yihe Dong, Piotr Indyk, Ilya P. Razenshteyn, and Tal Wagner. 2020. Learning Space Partitions for Nearest Neighbor Search. In 8th International Conference on Learning Representations."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.14778\/3503585.3503594"},{"key":"e_1_2_1_10_1","volume-title":"The Faiss Library. CoRR abs\/2401.08281","author":"Douze Matthijs","year":"2024","unstructured":"Matthijs Douze, Alexandr Guzhva, Chengqi Deng, Jeff Johnson, Gergely Szilvasy, Pierre-Emmanuel Mazar\u00e9, Maria Lomeli, Lucas Hosseini, and Herv\u00e9 J\u00e9gou. 2024. The Faiss Library. CoRR abs\/2401.08281 (2024)."},{"key":"e_1_2_1_11_1","volume-title":"Retrieved","author":"Ellis Jonathon","year":"2024","unstructured":"Jonathon Ellis. 2024. JVector. Retrieved April 12, 2024 from https:\/\/github.com\/jbellis\/jvector"},{"key":"e_1_2_1_12_1","volume-title":"Retrieved","author":"Face Hugging","year":"2024","unstructured":"Hugging Face. 2024. KShivendu\/dbpedia-entities-openai-1M. Retrieved April 12, 2024 from https:\/\/huggingface.co\/datasets\/KShivendu\/dbpedia-entities-openai-1M"},{"key":"e_1_2_1_13_1","volume-title":"Retrieved","author":"Software Foundation The Apache","year":"2024","unstructured":"The Apache Software Foundation. 2024. Apache Cassandra\u00ae 5.0: Moving Toward an AI-Driven Future. 
Retrieved April 12, 2024 from https:\/\/cassandra.apache.org\/_\/Apache-Cassandra-5.0-Moving-Toward-an-AI-Driven-Future.html"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2021.3067706"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.14778\/3303753.3303754"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/2783258.2783284"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3589282"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3654970"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2013.240"},{"key":"e_1_2_1_20_1","volume-title":"Proceedings of the 19th International Conference on Artificial Intelligence and Statistics. 482\u2013490","author":"Guo Ruiqi","year":"2016","unstructured":"Ruiqi Guo, Sanjiv Kumar, Krzysztof Choromanski, and David Simcha. 2016. Quantization Based Fast Inner Product Search. In Proceedings of the 19th International Conference on Artificial Intelligence and Statistics. 482\u2013490."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.14778\/3554821.3554843"},{"key":"e_1_2_1_22_1","volume-title":"Proceedings of the 37th International Conference on Machine Learning. 3887\u20133896","author":"Guo Ruiqi","year":"2020","unstructured":"Ruiqi Guo, Philip Sun, Erik Lindgren, Quan Geng, David Simcha, Felix Chern, and Sanjiv Kumar. 2020. Accelerating Large-Scale Inference with Anisotropic Vector Quantization. In Proceedings of the 37th International Conference on Machine Learning. 3887\u20133896."},{"key":"e_1_2_1_23_1","volume-title":"Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 486\u2013495","author":"Gupta Gaurav","unstructured":"Gaurav Gupta, Tharun Medini, Anshumali Shrivastava, and Alexander J. Smola. 2022. BLISS: A Billion Scale Index Using Iterative Re-Partitioning. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 
486\u2013495."},{"key":"e_1_2_1_24_1","volume-title":"Deep Residual Learning for Image Recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition. 770\u2013778","author":"He Kaiming","year":"2016","unstructured":"Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep Residual Learning for Image Recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition. 770\u2013778."},{"key":"e_1_2_1_25_1","volume-title":"MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. CoRR abs\/1704.04861","author":"Howard Andrew G.","year":"2017","unstructured":"Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, and Hartwig Adam. 2017. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. CoRR abs\/1704.04861 (2017)."},{"key":"e_1_2_1_26_1","volume-title":"Zhongle Xie, Cong Yue, Gang Chen, and Beng Chin Ooi.","author":"Hu Guoyu","year":"2025","unstructured":"Guoyu Hu, Shaofeng Cai, Tien Tuan Anh Dinh, Zhongle Xie, Cong Yue, Gang Chen, and Beng Chin Ooi. 2025. HAKES: Scalable Vector Database for Embedding Search Service (Extended Version). https:\/\/github.com\/nusdbsystem\/HAKES-Search\/tree\/main\/extended-version"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/276698.276876"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/1071610.1071612"},{"key":"e_1_2_1_29_1","volume-title":"Ravishankar Krishnaswamy, and Rohan Kadekodi.","author":"Subramanya Suhas Jayaram","year":"2019","unstructured":"Suhas Jayaram Subramanya, Fnu Devvrit, Harsha Vardhan Simhadri, Ravishankar Krishnaswamy, and Rohan Kadekodi. 2019. DiskANN: Fast Accurate Billion-Point Nearest Neighbor Search on a Single Node. In Advances in Neural Information Processing Systems. 
13766\u201313776."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2010.57"},{"key":"e_1_2_1_31_1","volume-title":"Searching in One Billion Vectors: Re-Rank with Source Coding. In 2011 IEEE International Conference on Acoustics, Speech and Signal Processing. 861\u2013864","author":"J\u00e9gou Herv\u00e9","year":"2011","unstructured":"Herv\u00e9 J\u00e9gou, Romain Tavenard, Matthijs Douze, and Laurent Amsaleg. 2011. Searching in One Billion Vectors: Re-Rank with Source Coding. In 2011 IEEE International Conference on Acoustics, Speech and Signal Processing. 861\u2013864."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.emnlp-main.550"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/1773912.1773922"},{"key":"e_1_2_1_34_1","volume-title":"Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data. 2589\u20132599","author":"Lei Yifan","unstructured":"Yifan Lei, Qiang Huang, Mohan Kankanhalli, and Anthony K. H. Tung. 2020. Locality-Sensitive Hashing Scheme Based on Longest Circular Co-substring. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data. 2589\u20132599."},{"key":"e_1_2_1_35_1","unstructured":"Patrick Lewis Ethan Perez Aleksandra Piktus Fabio Petroni Vladimir Karpukhin Naman Goyal Heinrich Kuttler Mike Lewis Wen-tau Yih Tim Rocktaschel Sebastian Riedel and Douwe Kiela. 2020. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. In Advances in Neural Information Processing Systems. 
9459\u20139474."},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/3318464.3380600"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3580305.3599406"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2019.2909204"},{"key":"e_1_2_1_39_1","volume-title":"Proceedings of the 42nd International Conference on Machine Learning.","author":"Lin Tianwei","year":"2025","unstructured":"Tianwei Lin, Wenqiao Zhang, Sijing Li, Yuqian Yuan, Binhe Yu, Haoyuan Li, Wanggui He, Hao Jiang, Mengze Li, Song xiaohui, Siliang Tang, Jun Xiao, Hui Lin, Yueting Zhuang, and Beng Chin Ooi. 2025. HealthGPT: A Medical Large Vision-Language Model for Unifying Comprehension and Generation via Heterogeneous Knowledge Adaptation. In Proceedings of the 42nd International Conference on Machine Learning."},{"key":"e_1_2_1_40_1","volume-title":"Image Retrieval on Real-Life Images with Pre-trained Vision-and-Language Models. In 2021 IEEE\/CVF International Conference on Computer Vision. 2105\u20132114","author":"Liu Zheyuan","year":"2021","unstructured":"Zheyuan Liu, Cristian Rodriguez-Opazo, Damien Teney, and Stephen Gould. 2021. Image Retrieval on Real-Life Images with Pre-trained Vision-and-Language Models. In 2021 IEEE\/CVF International Conference on Computer Vision. 2105\u20132114."},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/3543507.3583482"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2018.2889473"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330759"},{"key":"e_1_2_1_44_1","volume-title":"Article 200901","author":"Ooi Beng Chin","year":"2024","unstructured":"Beng Chin Ooi, Shaofeng Cai, Gang Chen, Yanyan Shen, Kian-Lee Tan, Yuncheng Wu, Xiaokui Xiao, Naili Xing, Cong Yue, Lingze Zeng, Meihui Zhang, and Zhanhao Zhao. 2024. NeurDB: an AI-powered autonomous data system. Sci. China Inf. Sci. 
67, 10, Article 200901 (2024), 10 pages."},{"key":"e_1_2_1_45_1","unstructured":"Ninh Pham and Tao Liu. 2024. Falconn++: A Locality-Sensitive Filtering Approach for Approximate Nearest Neighbor Search. In Advances in Neural Information Processing Systems. 31186\u201331198."},{"key":"e_1_2_1_46_1","volume-title":"Proceedings of the 38th International Conference on Machine Learning. 8748\u20138763","author":"Radford Alec","year":"2021","unstructured":"Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. 2021. Learning Transferable Visual Models From Natural Language Supervision. In Proceedings of the 38th International Conference on Machine Learning. 8748\u20138763."},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-015-0816-y"},{"key":"e_1_2_1_48_1","volume-title":"Spreading Vectors for Similarity Search. In 7th International Conference on Learning Representations.","author":"Sablayrolles Alexandre","year":"2019","unstructured":"Alexandre Sablayrolles, Matthijs Douze, Cordelia Schmid, and Herv\u00e9 J\u00e9gou. 2019. Spreading Vectors for Similarity Search. In 7th International Conference on Learning Representations."},{"key":"e_1_2_1_49_1","volume-title":"SOAR: Improved Indexing for Approximate Nearest Neighbor Search. In Advances in Neural Information Processing Systems. 3189\u20133204.","author":"Sun Philip","year":"2023","unstructured":"Philip Sun, David Simcha, Dave Dopson, Ruiqi Guo, and Sanjiv Kumar. 2023. SOAR: Improved Indexing for Approximate Nearest Neighbor Search. In Advances in Neural Information Processing Systems. 3189\u20133204."},{"key":"e_1_2_1_50_1","volume-title":"BEIR: A Heterogeneous Benchmark for Zero-shot Evaluation of Information Retrieval Models. 
In Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2).","author":"Thakur Nandan","year":"2021","unstructured":"Nandan Thakur, Nils Reimers, Andreas R\u00fcckl\u00e9, Abhishek Srivastava, and Iryna Gurevych. 2021. BEIR: A Heterogeneous Benchmark for Zero-shot Evaluation of Information Retrieval Models. In Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2)."},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/3448016.3457550"},{"key":"e_1_2_1_52_1","volume-title":"Text Embeddings by Weakly-Supervised Contrastive Pre-training. CoRR abs\/2212.03533","author":"Wang Liang","year":"2022","unstructured":"Liang Wang, Nan Yang, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang, Rangan Majumder, and Furu Wei. 2022. Text Embeddings by Weakly-Supervised Contrastive Pre-training. CoRR abs\/2212.03533 (2022)."},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/3639269"},{"key":"e_1_2_1_54_1","volume-title":"Retrieved","year":"2024","unstructured":"Weaviate. 2024. Weaviate. Retrieved April 12, 2024 from https:\/\/weaviate.io\/"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.14778\/3415478.3415541"},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/3477495.3531799"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1145\/3600006.3613166"},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/3488560.3498443"},{"key":"e_1_2_1_59_1","unstructured":"Hailin Zhang Yujing Wang Qi Chen Ruiheng Chang Ting Zhang Ziming Miao Yingyan Hou Yang Ding Xupeng Miao Haonan Wang Bochen Pang Yuefeng Zhan Hao Sun Weiwei Deng Qi Zhang Fan Yang Xing Xie Mao Yang and Bin Cui. 2024. Model-Enhanced Vector Index. In Advances in Neural Information Processing Systems. 
54903\u201354917."},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-022-00762-0"},{"key":"e_1_2_1_61_1","volume-title":"17th USENIX Symposium on Operating Systems Design and Implementation. 377\u2013395","author":"Zhang Qianxi","year":"2023","unstructured":"Qianxi Zhang, Shuotao Xu, Qi Chen, Guoxin Sui, Jiadong Xie, Zhizhen Cai, Yaoqi Chen, Yinxuan He, Yuqing Yang, Fan Yang, Mao Yang, and Lidong Zhou. 2023. VBASE: Unifying Online Vector Similarity Search and Relational Queries via Relaxed Monotonicity. In 17th USENIX Symposium on Operating Systems Design and Implementation. 377\u2013395."},{"key":"e_1_2_1_62_1","volume-title":"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. 1479\u20131488","author":"Zhang Yang","year":"2020","unstructured":"Yang Zhang, Fuli Feng, Chenxu Wang, Xiangnan He, Meng Wang, Yan Li, and Yongdong Zhang. 2020. How to Retrain Recommender System? A Sequential Meta-Learning Method. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. 1479\u20131488."},{"key":"e_1_2_1_63_1","volume-title":"Approximate Vector Queries on Very Large Unstructured Datasets. In 20th USENIX Symposium on Networked Systems Design and Implementation. 995\u20131011","author":"Zhang Zili","year":"2023","unstructured":"Zili Zhang, Chao Jin, Linpeng Tang, Xuanzhe Liu, and Xin Jin. 2023. Fast, Approximate Vector Queries on Very Large Unstructured Datasets. In 20th USENIX Symposium on Networked Systems Design and Implementation. 
995\u20131011."},{"key":"e_1_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.14778\/3594512.3594527"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3746405.3746427","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,4]],"date-time":"2025-09-04T19:50:11Z","timestamp":1757015411000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3746405.3746427"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,5]]},"references-count":64,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2025,5]]}},"alternative-id":["10.14778\/3746405.3746427"],"URL":"https:\/\/doi.org\/10.14778\/3746405.3746427","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2025,5]]},"assertion":[{"value":"2025-09-03","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}