{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,26]],"date-time":"2026-06-26T10:19:46Z","timestamp":1782469186111,"version":"3.54.5"},"reference-count":109,"publisher":"Association for Computing Machinery (ACM)","issue":"11","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2021,7]]},"abstract":"<jats:p>Approximate nearest neighbor search (ANNS) constitutes an important operation in a multitude of applications, including recommendation systems, information retrieval, and pattern recognition. In the past decade, graph-based ANNS algorithms have been the leading paradigm in this domain, with dozens of graph-based ANNS algorithms proposed. Such algorithms aim to provide effective, efficient solutions for retrieving the nearest neighbors for a given query. Nevertheless, these efforts focus on developing and optimizing algorithms with different approaches, so there is a real need for a comprehensive survey about the approaches' relative performance, strengths, and pitfalls. Thus here we provide a thorough comparative analysis and experimental evaluation of 13 representative graph-based ANNS algorithms via a new taxonomy and fine-grained pipeline. We compared each algorithm in a uniform test environment on eight real-world datasets and 12 synthetic datasets with varying sizes and characteristics. Our study yields novel discoveries, offerings several useful principles to improve algorithms, thus designing an optimized method that outperforms the state-of-the-art algorithms. This effort also helped us pinpoint algorithms' working portions, along with rule-of-thumb recommendations about promising research directions and suitable algorithms for practitioners in different fields.<\/jats:p>","DOI":"10.14778\/3476249.3476255","type":"journal-article","created":{"date-parts":[[2021,10,27]],"date-time":"2021-10-27T16:46:23Z","timestamp":1635353183000},"page":"1964-1978","source":"Crossref","is-referenced-by-count":221,"title":["A comprehensive survey and experimental comparison of graph-based approximate nearest neighbor search"],"prefix":"10.14778","volume":"14","author":[{"given":"Mengzhao","family":"Wang","sequence":"first","affiliation":[{"name":"Hangzhou Dianzi University, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Xiaoliang","family":"Xu","sequence":"additional","affiliation":[{"name":"Hangzhou Dianzi University, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Qiang","family":"Yue","sequence":"additional","affiliation":[{"name":"Hangzhou Dianzi University, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yuxiang","family":"Wang","sequence":"additional","affiliation":[{"name":"Hangzhou Dianzi University, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2021,10,27]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Retrieved","year":"2010"},{"key":"e_1_2_1_2_1","volume-title":"Retrieved","year":"2011"},{"key":"e_1_2_1_3_1","volume-title":"unknown. Common Crawl. Retrieved","year":"2020"},{"key":"e_1_2_1_4_1","volume-title":"unknown. TIMIT Audio. Retrieved","year":"2020"},{"key":"e_1_2_1_5_1","volume-title":"unknown. UQ-Vedio. Retrieved","year":"2019"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2013.6639328"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/2020408.2020576"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.14778\/3204028.3204034"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/293347.293348"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-68474-1_3"},{"key":"e_1_2_1_11_1","unstructured":"Martin Aum\u00fcller and Matteo Ceccarello. 2019. Benchmarking Nearest Neighbor Search: Influence of Local Intrinsic Dimensionality and Result Diversity in Real-World Datasets.. In EDML@ SDM. 14--23.  Martin Aum\u00fcller and Matteo Ceccarello. 2019. Benchmarking Nearest Neighbor Search: Influence of Local Intrinsic Dimensionality and Result Diversity in Real-World Datasets.. In EDML@ SDM. 14--23."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/116873.116880"},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the 36th International Conference on Machine Learning","volume":"97","author":"Baranchuk Dmitry","year":"2019"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2007.370210"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.5555\/1782394.1782417"},{"key":"e_1_2_1_16_1","volume-title":"Retrieved","author":"Bernhardsson Erik","year":"2017"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1038\/nphys1130"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1142\/S0218195912500112"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2016.7498244"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3227609.3227643"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2017.2781360"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/502807.502808"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.5555\/1577069.1755852"},{"key":"e_1_2_1_24_1","volume-title":"SPTAG: A library for fast approximate nearest neighbor search. https:\/\/github.com\/Microsoft\/SPTAG","author":"Chen Qi","year":"2018"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1022664626993"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1967.1053964"},{"key":"e_1_2_1_27_1","volume-title":"Pyramid: A General Framework for Distributed Similarity Search on Large-scale Datasets. In 2019 IEEE International Conference on Big Data (Big Data). IEEE, 1066--1071","author":"Deng Shiyuan","year":"2019"},{"key":"e_1_2_1_28_1","volume-title":"Retrieved","author":"Dong Wei","year":"2011"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/1963405.1963487"},{"key":"e_1_2_1_30_1","volume-title":"Link and Code: Fast Indexing With Graphs and Compact Regression Codes. In 2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018","author":"Douze Matthijs","year":"2018"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/2.410146"},{"key":"e_1_2_1_32_1","doi-asserted-by":"crossref","unstructured":"Steven Fortune. 1995. Voronoi diagrams and Delaunay triangulations. In Computing in Euclidean geometry. World Scientific 225--265.  Steven Fortune. 1995. Voronoi diagrams and Delaunay triangulations. In Computing in Euclidean geometry. World Scientific 225--265.","DOI":"10.1142\/9789812831699_0007"},{"key":"e_1_2_1_33_1","volume-title":"Efanna: An extremely fast approximate nearest neighbor search algorithm based on knn graph. arXiv preprint arXiv:1609.07228","author":"Fu Cong","year":"2016"},{"key":"e_1_2_1_34_1","volume-title":"High Dimensional Similarity Search with Satellite System Graph: Efficiency, Scalability, and Unindexed Query Compatibility","author":"Fu Cong","year":"2021"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.14778\/3303753.3303754"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2013.379"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.14778\/3397230.3397243"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10844-009-0081-z"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.5555\/2283516.2283615"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.616"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-017-0472-7"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.14778\/2850469.2850470"},{"key":"e_1_2_1_43_1","volume-title":"Neighborhood Graph and Tree for Indexing High-dimensional Data","author":"Iwasaki Masajiro","year":"2020"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46759-7_2"},{"key":"e_1_2_1_45_1","volume-title":"Optimization of indexing based on k-nearest neighbor graph for proximity search in high-dimensional data. arXiv preprint arXiv:1810.07355","author":"Iwasaki Masajiro","year":"2018"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1016\/0166-218X(91)90069-9"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/5.163414"},{"key":"e_1_2_1_48_1","volume-title":"GloVe: Global Vectors for Word Representation. Retrieved","author":"Jeffrey Pennington","year":"2020"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2010.57"},{"key":"e_1_2_1_50_1","volume-title":"HM-ANN: Efficient Billion-Point Nearest Neighbor Search on Heterogeneous Memory. In 34th Conference on Neural Information Processing Systems (NeurIPS","author":"Ren Jie","year":"2020"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCYB.2014.2302018"},{"key":"e_1_2_1_52_1","volume-title":"Retrieved","author":"Johnson Jeff","year":"2017"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/335305.335325"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1109\/GC46384.2019.00018"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1090\/S0002-9939-1956-0078686-7"},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/3318464.3380600"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2019.2909204"},{"key":"e_1_2_1_58_1","volume-title":"Graph based Nearest Neighbor Search: Promises and Failures. arXiv preprint arXiv:1904.02077","author":"Lin Peng-Cheng","year":"2019"},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-30645-8_49"},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.is.2013.10.006"},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0158162"},{"key":"e_1_2_1_62_1","volume-title":"Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs","author":"Malkov Yury A","year":"2018"},{"key":"e_1_2_1_63_1","volume-title":"Business and Consumer Analytics: New Ideas","author":"Mathieson Luke"},{"key":"e_1_2_1_64_1","volume-title":"PMD: An Optimal Transportation-Based User Distance for Recommender Systems. In European Conference on Information Retrieval. Springer, 272--280","author":"Meng Yitong","year":"2020"},{"key":"e_1_2_1_65_1","volume-title":"Retrieved","year":"2019"},{"key":"e_1_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2019.106970"},{"key":"e_1_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.14778\/2824032.2824059"},{"key":"e_1_2_1_68_1","volume-title":"Product quantization with dual codebooks for approximate nearest neighbor search. Neurocomputing","author":"Pan Zhibin","year":"2020"},{"key":"e_1_2_1_69_1","doi-asserted-by":"publisher","DOI":"10.1007\/11575832_14"},{"key":"e_1_2_1_70_1","doi-asserted-by":"publisher","DOI":"10.1007\/11764298_8"},{"key":"e_1_2_1_71_1","volume-title":"Comparative analysis of data structures for approximate nearest neighbor search. Data analytics","author":"Ponomarenko Alexander","year":"2014"},{"key":"e_1_2_1_72_1","volume-title":"International Conference on Machine Learning. PMLR, 7803--7813","author":"Prokhorenkova Liudmila","year":"2020"},{"key":"e_1_2_1_73_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10559-018-0034-z"},{"key":"e_1_2_1_75_1","volume-title":"Retrieved","author":"Russell Stewart","year":"2015"},{"key":"e_1_2_1_76_1","doi-asserted-by":"publisher","DOI":"10.1145\/371920.372071"},{"key":"e_1_2_1_77_1","doi-asserted-by":"publisher","DOI":"10.1109\/SFCS.1975.8"},{"key":"e_1_2_1_78_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-27615-7_8"},{"key":"e_1_2_1_79_1","volume-title":"Marcos R Vieira, and Daniel S Kaster.","author":"Shimomura Larissa C","year":"2020"},{"key":"e_1_2_1_80_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2008.4587638"},{"key":"e_1_2_1_81_1","volume-title":"3rd International Conference on Learning Representations, ICLR","author":"Simonyan Karen","year":"2015"},{"key":"e_1_2_1_82_1","volume-title":"NeurIPS","author":"Subramanya Suhas Jayaram","year":"2019"},{"key":"e_1_2_1_83_1","volume-title":"A scalable solution to the nearest neighbor search problem through local-search methods on neighbor graphs. Pattern Analysis and Applications","author":"Tellez Eric S","year":"2021"},{"key":"e_1_2_1_84_1","volume-title":"Proceedings of the 34 th Symposium on the INTERFACE. Citeseer.","author":"Toussaint Godfried","year":"2002"},{"key":"e_1_2_1_85_1","volume-title":"The relative neighbourhood graph of a finite planar set. Pattern recognition 12, 4","author":"Toussaint Godfried T","year":"1980"},{"key":"e_1_2_1_86_1","volume-title":"Efficient construction of neighborhood graphs by the multiple sorting method. arXiv preprint arXiv:0904.3151","author":"Uno Takeaki","year":"2009"},{"key":"e_1_2_1_87_1","doi-asserted-by":"publisher","DOI":"10.1145\/3323873.3325014"},{"key":"e_1_2_1_88_1","doi-asserted-by":"publisher","DOI":"10.5555\/1795114.1795180"},{"key":"e_1_2_1_89_1","doi-asserted-by":"publisher","DOI":"10.5555\/1018427.1020489"},{"key":"e_1_2_1_90_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDMW.2013.50"},{"key":"e_1_2_1_91_1","doi-asserted-by":"publisher","DOI":"10.1145\/2393347.2393378"},{"key":"e_1_2_1_92_1","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2015.2487976"},{"key":"e_1_2_1_93_1","doi-asserted-by":"publisher","DOI":"10.5555\/2354409.2354854"},{"key":"e_1_2_1_94_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2013.125"},{"key":"e_1_2_1_95_1","volume-title":"Heng Tao Shen, et al","author":"Wang Jingdong","year":"2017"},{"key":"e_1_2_1_96_1","doi-asserted-by":"publisher","DOI":"10.14778\/3476249.3476255"},{"key":"e_1_2_1_97_1","doi-asserted-by":"publisher","DOI":"10.5555\/645924.671192"},{"key":"e_1_2_1_98_1","doi-asserted-by":"publisher","DOI":"10.14778\/3415478.3415541"},{"key":"e_1_2_1_99_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2020.2978386"},{"key":"e_1_2_1_100_1","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.5970"},{"key":"e_1_2_1_101_1","doi-asserted-by":"publisher","DOI":"10.5555\/1951721"},{"key":"e_1_2_1_102_1","volume-title":"Zoom: Ssd-based vector search for optimizing accuracy, latency and memory. arXiv preprint arXiv:1809.04067","author":"Zhang Minjia","year":"2018"},{"key":"e_1_2_1_103_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-40991-2_42"},{"key":"e_1_2_1_104_1","doi-asserted-by":"publisher","DOI":"10.1145\/3357384.3357834"},{"key":"e_1_2_1_105_1","volume-title":"SONG: Approximate Nearest Neighbor Search on GPU. In 2020 IEEE 36th International Conference on Data Engineering (ICDE). IEEE, 1033--1044","author":"Zhao Weijie","year":"2020"},{"key":"e_1_2_1_106_1","volume-title":"k-NN graph construction: a generic online approach. arXiv preprint arXiv:1804.03032","author":"Zhao Wan-Lei","year":"2018"},{"key":"e_1_2_1_107_1","volume-title":"On the Merge of k-NN Graph. arXiv preprint arXiv:1908.00814","author":"Zhao Wan-Lei","year":"2019"},{"key":"e_1_2_1_108_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2019.107040"},{"key":"e_1_2_1_109_1","doi-asserted-by":"publisher","DOI":"10.1109\/CBD.2013.20"},{"key":"e_1_2_1_110_1","doi-asserted-by":"publisher","DOI":"10.1109\/BIBM47256.2019.8982950"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3476249.3476255","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T09:55:58Z","timestamp":1672221358000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3476249.3476255"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,7]]},"references-count":109,"journal-issue":{"issue":"11","published-print":{"date-parts":[[2021,7]]}},"alternative-id":["10.14778\/3476249.3476255"],"URL":"https:\/\/doi.org\/10.14778\/3476249.3476255","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2021,7]]}}}