{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,11]],"date-time":"2026-01-11T03:49:03Z","timestamp":1768103343955,"version":"3.49.0"},"reference-count":67,"publisher":"Association for Computing Machinery (ACM)","issue":"6","funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62302421"],"award-info":[{"award-number":["62302421"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Basic and Applied Basic Research Fund in Guangdong Province","award":["2023A1515011280, 2025A1515010439"],"award-info":[{"award-number":["2023A1515011280, 2025A1515010439"]}]},{"DOI":"10.13039\/100018735","name":"Ant Group","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100018735","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Guangdong Provincial Key Laboratory of Big Data Computing, The Chinese University of Hong Kong, Shenzhen"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Manag. Data"],"published-print":{"date-parts":[[2025,12,4]]},"abstract":"<jats:p>With the growing integration of structured and unstructured data, new methods have emerged for performing similarity searches on vectors while honoring structured attribute constraints, i.e., a process known as Filtering Approximate Nearest Neighbor (Filtering ANN) search. Since many of these algorithms have only appeared in recent years and are designed to work with a variety of base indexing methods and filtering strategies, there is a pressing need for a unified analysis that identifies their core techniques and enables meaningful comparisons.<\/jats:p>\n                  <jats:p>In this work, we present a unified Filtering ANN search interface that encompasses the latest algorithms and evaluate them extensively from multiple perspectives. 
First, we propose a comprehensive taxonomy of existing Filtering ANN algorithms based on attribute types and filtering strategies. Next, we analyze their key components, i.e., index structures, pruning strategies, and entry point selection, to elucidate design differences and tradeoffs. We then conduct a broad experimental evaluation on 10 algorithms and 12 methods across 4 datasets (each with up to 10 million items), incorporating both synthetic and real attributes and covering selectivity levels from 0.1% to 100%. Finally, an in-depth component analysis reveals the influence of pruning, entry point selection, and edge filtering costs on overall performance. Based on our findings, we summarize the strengths and limitations of each approach, provide practical guidelines for selecting appropriate methods, and suggest promising directions for future research. Our code is available at: https:\/\/github.com\/lmccccc\/FANNBench.<\/jats:p>","DOI":"10.1145\/3769763","type":"journal-article","created":{"date-parts":[[2025,12,6]],"date-time":"2025-12-06T04:32:13Z","timestamp":1764995533000},"page":"1-26","source":"Crossref","is-referenced-by-count":0,"title":["Attribute Filtering in Approximate Nearest Neighbor Search: An In-depth Experimental Study"],"prefix":"10.1145","volume":"3","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5524-1042","authenticated-orcid":false,"given":"Mocheng","family":"Li","sequence":"first","affiliation":[{"name":"The Chinese University of Hong Kong, Shenzhen, Shenzhen, Guangdong, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2122-915X","authenticated-orcid":false,"given":"Xiao","family":"Yan","sequence":"additional","affiliation":[{"name":"Wuhan University, Institute for Math &amp; AI, Wuhan, Wuhan, Hubei, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0230-1048","authenticated-orcid":false,"given":"Baotong","family":"Lu","sequence":"additional","affiliation":[{"name":"Microsoft Research, Beijing, 
China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-5199-7799","authenticated-orcid":false,"given":"Yue","family":"Zhang","sequence":"additional","affiliation":[{"name":"The Chinese University of Hong Kong, Shenzhen, Shenzhen, Guangdong, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6313-6288","authenticated-orcid":false,"given":"James","family":"Cheng","sequence":"additional","affiliation":[{"name":"The Chinese University of Hong Kong, Hong Kong, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3243-8512","authenticated-orcid":false,"given":"Chenhao","family":"Ma","sequence":"additional","affiliation":[{"name":"The Chinese University of Hong Kong, Shenzhen, Shenzhen, Guangdong, China"}]}],"member":"320","published-online":{"date-parts":[[2025,12,5]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"1996. PostgreSQL. https:\/\/www.postgresql.org\/"},{"key":"e_1_2_1_2_1","unstructured":"2016. Yahoo. Nearest neighbor search with neighborhood graph and tree for high-dimensional data. https:\/\/github.com\/yahoojapan\/NGT"},{"key":"e_1_2_1_3_1","unstructured":"2019. Weaviate: Vector database for contextual queries. https:\/\/github.com\/semi-technologies\/weaviate"},{"key":"e_1_2_1_4_1","unstructured":"2020. Sptag: A library for fast approximate nearest neighbor search. https:\/\/github.com\/microsoft\/SPTAG"},{"key":"e_1_2_1_5_1","unstructured":"2020. Vearch: A Distributed System for Embedding-based. https:\/\/github.com\/vearch\/vearch"},{"key":"e_1_2_1_6_1","unstructured":"2021. pgvector. https:\/\/github.com\/pgvector\/pgvector"},{"key":"e_1_2_1_7_1","unstructured":"2021. Pinecone. https:\/\/www.pinecone.io\/"},{"key":"e_1_2_1_8_1","unstructured":"2021. qdrant. http:\/\/qdrant.tech\/"},{"key":"e_1_2_1_9_1","unstructured":"2025. The technical report is available in our repository. 
https:\/\/github.com\/lmccccc\/FANNBench"},{"key":"e_1_2_1_10_1","volume-title":"Hd-index: Pushing the scalability-accuracy boundary for approximate knn search in high-dimensional spaces. arXiv preprint arXiv:1804.06829","author":"Arora Akhil","year":"2018","unstructured":"Akhil Arora, Sakshi Sinha, Piyush Kumar, and Arnab Bhattacharya. 2018. Hd-index: Pushing the scalability-accuracy boundary for approximate knn search in high-dimensional spaces. arXiv preprint arXiv:1804.06829 (2018)."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.is.2019.02.006"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1142\/8685"},{"key":"e_1_2_1_13_1","unstructured":"BigANN Benchmark. 2021. Billion-Scale Approximate Nearest Neighbor Search Challenge: NeurIPS'21 competition track."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1967.1053964"},{"key":"e_1_2_1_15_1","first-page":"873","article-title":"Vs-quant: Pervector scaled quantization for accurate low-precision neural network inference","volume":"3","author":"Dai Steve","year":"2021","unstructured":"Steve Dai, Rangha Venkatesan, Mark Ren, Brian Zimmer, William Dally, and Brucek Khailany. 2021. Vs-quant: Pervector scaled quantization for accurate low-precision neural network inference. Proceedings of Machine Learning and Systems 3 (2021), 873-884.","journal-title":"Proceedings of Machine Learning and Systems"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/997817.997857"},{"key":"e_1_2_1_17_1","volume-title":"Redcaps: Web-curated image-text data created by the people, for the people. arXiv preprint arXiv:2111.11431","author":"Desai Karan","year":"2021","unstructured":"Karan Desai, Gaurav Kaul, Zubin Aysola, and Justin Johnson. 2021. Redcaps: Web-curated image-text data created by the people, for the people. 
arXiv preprint arXiv:2111.11431 (2021)."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/1963405.1963487"},{"key":"e_1_2_1_19_1","volume-title":"The faiss library. arXiv preprint arXiv:2401.08281","author":"Douze Matthijs","year":"2024","unstructured":"Matthijs Douze, Alexandr Guzhva, Chengqi Deng, Jeff Johnson, Gergely Szilvasy, Pierre-Emmanuel Mazar\u00e9, Maria Lomeli, Lucas Hosseini, and Herv\u00e9 J\u00e9gou. 2024. The faiss library. arXiv preprint arXiv:2401.08281 (2024)."},{"key":"e_1_2_1_20_1","volume-title":"From local to global: A graph rag approach to query-focused summarization. arXiv preprint arXiv:2404.16130","author":"Edge Darren","year":"2024","unstructured":"Darren Edge, Ha Trinh, Newman Cheng, Joshua Bradley, Alex Chao, Apurva Mody, Steven Truitt, and Jonathan Larson. 2024. From local to global: A graph rag approach to query-focused summarization. arXiv preprint arXiv:2404.16130 (2024)."},{"key":"e_1_2_1_21_1","volume-title":"Approximate Nearest Neighbor Search with Window Filters. arXiv preprint arXiv:2402.00943","author":"Engels Joshua","year":"2024","unstructured":"Joshua Engels, Benjamin Landrum, Shangdi Yu, Laxman Dhulipala, and Julian Shun. 2024. Approximate Nearest Neighbor Search with Window Filters. arXiv preprint arXiv:2402.00943 (2024)."},{"key":"e_1_2_1_22_1","volume-title":"Efanna: An extremely fast approximate nearest neighbor search algorithm based on knn graph. arXiv preprint arXiv:1609.07228","author":"Fu Cong","year":"2016","unstructured":"Cong Fu and Deng Cai. 2016. Efanna: An extremely fast approximate nearest neighbor search algorithm based on knn graph. arXiv preprint arXiv:1609.07228 (2016)."},{"key":"e_1_2_1_23_1","volume-title":"Fast approximate nearest neighbor search with the navigating spreading-out graph. arXiv preprint arXiv:1707.00143","author":"Fu Cong","year":"2017","unstructured":"Cong Fu, Chao Xiang, Changxu Wang, and Deng Cai. 2017. 
Fast approximate nearest neighbor search with the navigating spreading-out graph. arXiv preprint arXiv:1707.00143 (2017)."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3654970"},{"key":"e_1_2_1_25_1","volume-title":"Optimized product quantization","author":"Ge Tiezheng","year":"2013","unstructured":"Tiezheng Ge, Kaiming He, Qifa Ke, and Jian Sun. 2013. Optimized product quantization. IEEE transactions on pattern analysis and machine intelligence 36, 4 (2013), 744-755."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3543507.3583552"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.14778\/3397230.3397243"},{"key":"e_1_2_1_28_1","volume-title":"HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models. arXiv preprint arXiv:2405.14831","author":"Guti\u00e9rrez Bernal Jim\u00e9nez","year":"2024","unstructured":"Bernal Jim\u00e9nez Guti\u00e9rrez, Yiheng Shu, Yu Gu, Michihiro Yasunaga, and Yu Su. 2024. HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models. arXiv preprint arXiv:2405.14831 (2024)."},{"key":"e_1_2_1_29_1","volume-title":"G-retriever: Retrieval-augmented generation for textual graph understanding and question answering. arXiv preprint arXiv:2402.07630","author":"He Xiaoxin","year":"2024","unstructured":"Xiaoxin He, Yijun Tian, Yifei Sun, Nitesh V Chawla, Thomas Laurent, Yann LeCun, Xavier Bresson, and Bryan Hooi. 2024. G-retriever: Retrieval-augmented generation for textual graph understanding and question answering. arXiv preprint arXiv:2402.07630 (2024)."},{"key":"e_1_2_1_30_1","first-page":"102121","article-title":"LoRANN: Low-Rank Matrix Factorization for Approximate Nearest Neighbor Search","volume":"37","author":"J\u00e4\u00e4saari Elias","year":"2024","unstructured":"Elias J\u00e4\u00e4saari, Ville Hyv\u00f6nen, and Teemu Roos. 2024. LoRANN: Low-Rank Matrix Factorization for Approximate Nearest Neighbor Search. 
Advances in Neural Information Processing Systems 37 (2024), 102121-102153.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_1_31_1","volume-title":"Ravishankar Krishnawamy, and Rohan Kadekodi.","author":"Subramanya Suhas Jayaram","year":"2019","unstructured":"Suhas Jayaram Subramanya, Fnu Devvrit, Harsha Vardhan Simhadri, Ravishankar Krishnawamy, and Rohan Kadekodi. 2019. Diskann: Fast accurate billion-point nearest neighbor search on a single node. Advances in Neural Information Processing Systems 32 (2019)."},{"key":"e_1_2_1_32_1","volume-title":"Product quantization for nearest neighbor search","author":"Jegou Herve","year":"2010","unstructured":"Herve Jegou, Matthijs Douze, and Cordelia Schmid. 2010. Product quantization for nearest neighbor search. IEEE transactions on pattern analysis and machine intelligence 33, 1 (2010), 117-128."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2011.5946540"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/GC46384.2019.00018"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1090\/S0002-9939-1956-0078686-7"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/3284028.3284030"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2019.2909204"},{"key":"e_1_2_1_38_1","volume-title":"UNIFY: Unified Index for Range Filtered Approximate Nearest Neighbors Search. arXiv preprint arXiv:2412.02448","author":"Liang Anqi","year":"2024","unstructured":"Anqi Liang, Pengcheng Zhang, Bin Yao, Zhongpu Chen, Yitong Song, and Guangxu Cheng. 2024. UNIFY: Unified Index for Range Filtered Approximate Nearest Neighbors Search. arXiv preprint arXiv:2412.02448 (2024)."},{"key":"e_1_2_1_39_1","volume-title":"Proceedings of 5-th Berkeley Symposium on Mathematical Statistics and Probability\/University of California Press.","author":"MacQueen J","year":"1967","unstructured":"J MacQueen. 1967. 
Some methods for classification and analysis of multivariate observations. In Proceedings of 5-th Berkeley Symposium on Mathematical Statistics and Probability\/University of California Press."},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.is.2013.10.006"},{"key":"e_1_2_1_41_1","volume-title":"Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs","author":"Malkov Yu A","year":"2018","unstructured":"Yu A Malkov and Dmitry A Yashunin. 2018. Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs. IEEE transactions on pattern analysis and machine intelligence 42, 4 (2018), 824-836."},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/3240508.3240630"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-45442-5_34"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/3589777"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2023.120832"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/3097983.3098108"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-024-00864-x"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2020.03.016"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1007\/11575832_14"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/3654923"},{"key":"e_1_2_1_51_1","first-page":"5","volume-title":"Proceedings of KDD cup and workshop","volume":"2007","author":"Paterek Arkadiusz","year":"2007","unstructured":"Arkadiusz Paterek. 2007. Improving regularized singular value decomposition for collaborative filtering. In Proceedings of KDD cup and workshop, Vol. 2007. 5-8."},{"key":"e_1_2_1_52_1","unstructured":"Zhencan Peng Miao Qiao Wenchao Zhou Feifei Li and Dong Deng. [n.d.]. Dynamic Range-Filtering Approximate Nearest Neighbor Search. 
([n.d.])."},{"key":"e_1_2_1_53_1","volume-title":"International conference on machine learning. PMLR, 8748-8763","author":"Radford Alec","year":"2021","unstructured":"Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. 2021. Learning transferable visual models from natural language supervision. In International conference on machine learning. PMLR, 8748-8763."},{"key":"e_1_2_1_54_1","volume-title":"Fast and Exact Similarity Search in less than a Blink of an Eye. arXiv preprint arXiv:2411.17483","author":"Sch\u00e4fer Patrick","year":"2024","unstructured":"Patrick Sch\u00e4fer, Jakob Brand, Ulf Leser, Botao Peng, and Themis Palpanas. 2024. Fast and Exact Similarity Search in less than a Blink of an Eye. arXiv preprint arXiv:2411.17483 (2024)."},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2008.4587638"},{"key":"e_1_2_1_56_1","volume-title":"The relative neighbourhood graph of a finite planar set. Pattern recognition 12, 4","author":"Toussaint Godfried T","year":"1980","unstructured":"Godfried T Toussaint. 1980. The relative neighbourhood graph of a finite planar set. Pattern recognition 12, 4 (1980), 261-268."},{"key":"e_1_2_1_57_1","volume-title":"Attention is all you need. Advances in Neural Information Processing Systems","author":"Vaswani A","year":"2017","unstructured":"A Vaswani. 2017. Attention is all you need. Advances in Neural Information Processing Systems (2017)."},{"key":"e_1_2_1_58_1","volume-title":"Proceedings of the 2021 International Conference on Management of Data. 2614-2627","author":"Yi Xiaomeng","year":"2021","unstructured":"Jianguo Wang, Xiaomeng Yi, Rentong Guo, Hai Jin, Peng Xu, Shengjun Li, Xiangyu Wang, Xiangzhou Guo, Chengming Li, Xiaohai Xu, et al. 2021. Milvus: A purpose-built vector data management system. In Proceedings of the 2021 International Conference on Management of Data. 
2614-2627."},{"key":"e_1_2_1_59_1","volume-title":"An efficient and robust framework for approximate nearest neighbor search with attribute constraint. Advances in Neural Information Processing Systems 36","author":"Wang Mengzhao","year":"2024","unstructured":"Mengzhao Wang, Lingwei Lv, Xiaoliang Xu, Yuxiang Wang, Qiang Yue, and Jiongkang Ni. 2024. An efficient and robust framework for approximate nearest neighbor search with attribute constraint. Advances in Neural Information Processing Systems 36 (2024)."},{"key":"e_1_2_1_60_1","volume-title":"A comprehensive survey and experimental comparison of graph-based approximate nearest neighbor search. arXiv preprint arXiv:2101.12631","author":"Wang Mengzhao","year":"2021","unstructured":"Mengzhao Wang, Xiaoliang Xu, Qiang Yue, and Yuxiang Wang. 2021. A comprehensive survey and experimental comparison of graph-based approximate nearest neighbor search. arXiv preprint arXiv:2101.12631 (2021)."},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.14778\/3415478.3415541"},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1145\/3698814"},{"key":"e_1_2_1_63_1","volume-title":"Xiyue Gao, Qianru Wang, Yanguo Peng, and Jiangtao Cui.","author":"Yang Shuo","year":"2024","unstructured":"Shuo Yang, Jiadong Xie, Yingfan Liu, Jeffrey Xu Yu, Xiyue Gao, Qianru Wang, Yanguo Peng, and Jiangtao Cui. 2024. Revisiting the Index Construction of Proximity Graph-Based Approximate Nearest Neighbor Search. arXiv preprint arXiv:2410.01231 (2024)."},{"key":"e_1_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.1145\/3318464.3386131"},{"key":"e_1_2_1_65_1","first-page":"377","volume-title":"17th USENIX Symposium on Operating Systems Design and Implementation (OSDI 23)","author":"Zhang Qianxi","year":"2023","unstructured":"Qianxi Zhang, Shuotao Xu, Qi Chen, Guoxin Sui, Jiadong Xie, Zhizhen Cai, Yaoqi Chen, Yinxuan He, Yuqing Yang, Fan Yang, et al. 2023. 
{VBASE}: Unifying Online Vector Similarity Search and Relational Queries via Relaxed Monotonicity. In 17th USENIX Symposium on Operating Systems Design and Implementation (OSDI 23). 377-395."},{"key":"e_1_2_1_66_1","volume-title":"Constrained approximate similarity search on proximity graph. arXiv preprint arXiv:2210.14958","author":"Zhao Weijie","year":"2022","unstructured":"Weijie Zhao, Shulong Tan, and Ping Li. 2022. Constrained approximate similarity search on proximity graph. arXiv preprint arXiv:2210.14958 (2022)."},{"key":"e_1_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.1145\/3639324"}],"container-title":["Proceedings of the ACM on Management of Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3769763","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,1,10]],"date-time":"2026-01-10T23:22:45Z","timestamp":1768087365000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3769763"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,12,4]]},"references-count":67,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2025,12,4]]}},"alternative-id":["10.1145\/3769763"],"URL":"https:\/\/doi.org\/10.1145\/3769763","relation":{},"ISSN":["2836-6573"],"issn-type":[{"value":"2836-6573","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,12,4]]}}}