{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,7]],"date-time":"2026-04-07T05:12:21Z","timestamp":1775538741090,"version":"3.50.1"},"reference-count":66,"publisher":"Association for Computing Machinery (ACM)","issue":"6","funder":[{"name":"Oceanic Interdisciplinary Program of Shanghai Jiao Tong University","award":["SL2023ZD102"],"award-info":[{"award-number":["SL2023ZD102"]}]},{"name":"Alibaba Innovative Research (AIR) Program"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Manag. Data"],"published-print":{"date-parts":[[2025,12,4]]},"abstract":"<jats:p>High-dimensional vector similarity search (HVSS) is critical for many data processing and AI applications. However, traditional HVSS methods often require extensive data access for distance calculations, leading to inefficiencies. Triangle-inequality-based lower bound pruning is a widely used technique to reduce the number of data access in low-dimensional spaces but becomes less effective in high-dimensional settings. This is attributed to the ''distance concentration'' phenomenon, where the lower bounds derived from the triangle inequality become too small to be useful. To address this, we propose TRIM, which enhances the effectiveness of traditional triangle-inequality-based pruning in high-dimensional vector similarity search using two key ways: (1) optimizing landmark vectors used to form the triangles, and (2) relaxing the lower bounds derived from the triangle inequality, with the relaxation degree adjustable according to user's needs. TRIM is a versatile operation that can be seamlessly integrated into both memory-based (e.g., HNSW, IVFPQ) and disk-based (e.g., DiskANN) HVSS methods, reducing distance calculations and disk access. Extensive experiments show that TRIM enhances memory-based methods, improving graph-based search by up to 90% and quantization-based search by up to 200%, while achieving a pruning ratio of up to 99%. It also reduces I\/O costs by up to 58% and improves efficiency by 102% for disk-based methods, while preserving high query accuracy. Our source code is available at https:\/\/github.com\/petrizhang\/TRIM.<\/jats:p>","DOI":"10.1145\/3769838","type":"journal-article","created":{"date-parts":[[2025,12,6]],"date-time":"2025-12-06T04:32:13Z","timestamp":1764995533000},"page":"1-26","source":"Crossref","is-referenced-by-count":0,"title":["TRIM: Accelerating High-Dimensional Vector Similarity Search with Enhanced Triangle-Inequality-Based Pruning"],"prefix":"10.1145","volume":"3","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4220-0145","authenticated-orcid":false,"given":"Yitong","family":"Song","sequence":"first","affiliation":[{"name":"Shanghai Jiao Tong University, Shanghai, China and Hong Kong Baptist University, Hong Kong, Hong Kong"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2416-7556","authenticated-orcid":false,"given":"Pengcheng","family":"Zhang","sequence":"additional","affiliation":[{"name":"Shanghai Jiao Tong University, Shanghai, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-1926-6110","authenticated-orcid":false,"given":"Chao","family":"Gao","sequence":"additional","affiliation":[{"name":"Zilliz, Shanghai, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6478-4209","authenticated-orcid":false,"given":"Bin","family":"Yao","sequence":"additional","affiliation":[{"name":"Shanghai Jiao Tong University, Shanghai, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3123-2184","authenticated-orcid":false,"given":"Kai","family":"Wang","sequence":"additional","affiliation":[{"name":"Antai College of Economics and Management, Shanghai Jiao Tong University, Shanghai, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0003-2575-9558","authenticated-orcid":false,"given":"Zongyuan","family":"Wu","sequence":"additional","affiliation":[{"name":"Alibaba Group, Hangzhou, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-2028-0780","authenticated-orcid":false,"given":"Lin","family":"Qu","sequence":"additional","affiliation":[{"name":"Taobao, Hangzhou, China"}]}],"member":"320","published-online":{"date-parts":[[2025,12,5]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Similarity search in the blink of an eye with compressed indices. arXiv preprint arXiv:2304.04759","author":"Aguerrebere Cecilia","year":"2023","unstructured":"Cecilia Aguerrebere, Ishwar Bhati, Mark Hildebrand, Mariano Tepper, and Ted Willke. 2023. Similarity search in the blink of an eye with compressed indices. arXiv preprint arXiv:2304.04759 (2023)."},{"key":"e_1_2_1_2_1","unstructured":"Meta AI. 2017. FAISS. https:\/\/ai.facebook.com\/tools\/faiss."},{"key":"e_1_2_1_3_1","first-page":"12","article-title":"Cache locality is not enough: High-performance nearest neighbor search with product quantization fast scan","volume":"9","author":"Andr\u00e9 Fabien","year":"2016","unstructured":"Fabien Andr\u00e9, Anne-Marie Kermarrec, and Nicolas Le Scouarnec. 2016a. Cache locality is not enough: High-performance nearest neighbor search with product quantization fast scan. In VLDB, Vol. 9. 12.","journal-title":"VLDB"},{"key":"e_1_2_1_4_1","first-page":"12","article-title":"Cache locality is not enough: High-performance nearest neighbor search with product quantization fast scan","volume":"9","author":"Andr\u00e9 Fabien","year":"2016","unstructured":"Fabien Andr\u00e9, Anne-Marie Kermarrec, and Nicolas Le Scouarnec. 2016b. Cache locality is not enough: High-performance nearest neighbor search with product quantization fast scan. In VLDB, Vol. 9. 12.","journal-title":"VLDB"},{"key":"e_1_2_1_5_1","first-page":"1666","volume-title":"IEEE TPAMI","volume":"43","author":"Andr\u00e9 Fabien","year":"2019","unstructured":"Fabien Andr\u00e9, Anne-Marie Kermarrec, and Nicolas Le Scouarnec. 2019. Quicker adc: Unlocking the hidden potential of product quantization with simd. IEEE TPAMI, Vol. 43, 5 (2019), 1666-1677."},{"key":"e_1_2_1_6_1","first-page":"931","article-title":"Additive quantization for extreme vector compression","author":"Babenko Artem","year":"2014","unstructured":"Artem Babenko and Victor Lempitsky. 2014a. Additive quantization for extreme vector compression. In CVPR. 931-938.","journal-title":"CVPR."},{"key":"e_1_2_1_7_1","first-page":"1247","volume-title":"IEEE TPAMI","volume":"37","author":"Babenko Artem","year":"2014","unstructured":"Artem Babenko and Victor Lempitsky. 2014b. The inverted multi-index. IEEE TPAMI, Vol. 37, 6 (2014), 1247-1260."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/361002.361007"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/1143844.1143857"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3543507.3583318"},{"key":"e_1_2_1_11_1","first-page":"426","article-title":"M-tree: An efficient access method for similarity search in metric spaces","volume":"97","author":"Ciaccia Paolo","year":"1997","unstructured":"Paolo Ciaccia, Marco Patella, and Pavel Zezula. 1997. M-tree: An efficient access method for similarity search in metric spaces. In VLDB, Vol. 97. 426-435.","journal-title":"VLDB"},{"key":"e_1_2_1_12_1","unstructured":"HNSWlib Contributors. 2023. HNSWlib. https:\/\/github.com\/nmslib\/hnswlib."},{"key":"e_1_2_1_13_1","first-page":"537","article-title":"Random projection trees and low dimensional manifolds","author":"Dasgupta Sanjoy","year":"2008","unstructured":"Sanjoy Dasgupta and Yoav Freund. 2008. Random projection trees and low dimensional manifolds. In STOC. 537-546.","journal-title":"STOC."},{"key":"e_1_2_1_14_1","volume-title":"Efficient Data-aware Distance Comparison Operations for High-Dimensional Approximate Nearest Neighbor Search. arXiv preprint arXiv:2411.17229","author":"Deng Liwei","year":"2024","unstructured":"Liwei Deng, Penghao Chen, Ximu Zeng, Tianfu Wang, Yan Zhao, and Kai Zheng. 2024. Efficient Data-aware Distance Comparison Operations for High-Dimensional Approximate Nearest Neighbor Search. arXiv preprint arXiv:2411.17229 (2024)."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/3140587.3062377"},{"key":"e_1_2_1_16_1","first-page":"147","article-title":"Using the triangle inequality to accelerate k-means","author":"Elkan Charles","year":"2003","unstructured":"Charles Elkan. 2003. Using the triangle inequality to accelerate k-means. In ICML. 147-153.","journal-title":"ICML."},{"key":"e_1_2_1_17_1","first-page":"873","article-title":"The concentration of fractional distances","volume":"19","author":"Fran\u00e7ois Damien","year":"2007","unstructured":"Damien Fran\u00e7ois, Vincent Wertz, and Michel Verleysen. 2007. The concentration of fractional distances. IEEE TKDE, Vol. 19, 7 (2007), 873-886.","journal-title":"IEEE TKDE"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2021.3067706"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.14778\/3303753.3303754"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3725413"},{"key":"e_1_2_1_21_1","unstructured":"Jianyang Gao Yutong Gou Yuexuan Xu Yongyi Yang Cheng Long and Raymond Chi-Wing Wong. 2025b. RaBitQ-Library. https:\/\/github.com\/VectorDB-NTU\/RaBitQ-Library."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/3589282"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/3654970"},{"key":"e_1_2_1_24_1","first-page":"2946","article-title":"Optimized product quantization for approximate nearest neighbor search","author":"Ge Tiezheng","year":"2013","unstructured":"Tiezheng Ge, Kaiming He, Qifa Ke, and Jian Sun. 2013. Optimized product quantization for approximate nearest neighbor search. In IEEE TPAMI. 2946-2953.","journal-title":"IEEE TPAMI."},{"key":"e_1_2_1_25_1","first-page":"156","volume-title":"SODA","volume":"5","author":"Goldberg Andrew V","year":"2005","unstructured":"Andrew V Goldberg and Chris Harrelson. 2005. Computing the shortest path: A search meets graph theory.. In SODA, Vol. 5. 156-165."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.14778\/3397230.3397243"},{"key":"e_1_2_1_27_1","first-page":"3887","article-title":"Accelerating large-scale inference with anisotropic vector quantization","author":"Guo Ruiqi","year":"2020","unstructured":"Ruiqi Guo, Philip Sun, Erik Lindgren, Quan Geng, David Simcha, Felix Chern, and Sanjiv Kumar. 2020. Accelerating large-scale inference with anisotropic vector quantization. In ICML. PMLR, 3887-3896.","journal-title":"ICML. PMLR"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSSC.1968.300136"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.14778\/2850469.2850470"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/276698.276876"},{"key":"e_1_2_1_31_1","volume-title":"Ravishankar Krishnawamy, and Rohan Kadekodi.","author":"Subramanya Suhas Jayaram","year":"2019","unstructured":"Suhas Jayaram Subramanya, Fnu Devvrit, Harsha Vardhan Simhadri, Ravishankar Krishnawamy, and Rohan Kadekodi. 2019. Diskann: Fast accurate billion-point nearest neighbor search on a single node. Advances in Neural Information Processing Systems, Vol. 32 (2019)."},{"key":"e_1_2_1_32_1","first-page":"117","volume-title":"IEEE TPAMI","volume":"33","author":"Jegou Herve","year":"2010","unstructured":"Herve Jegou, Matthijs Douze, and Cordelia Schmid. 2010. Product quantization for nearest neighbor search. IEEE TPAMI, Vol. 33, 1 (2010), 117-128."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/3477.764879"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-13529-3_8"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-71246-8_51"},{"key":"e_1_2_1_36_1","first-page":"2539","article-title":"Improving approximate nearest neighbor search through learned adaptive early termination","author":"Li Conglong","year":"2020","unstructured":"Conglong Li, Minjia Zhang, David G Andersen, and Yuxiong He. 2020. Improving approximate nearest neighbor search through learned adaptive early termination. In ACM SIGMOD. 2539-2554.","journal-title":"ACM SIGMOD."},{"key":"e_1_2_1_37_1","first-page":"1333","article-title":"A general and efficient querying method for learning to hash","author":"Li Jinfeng","year":"2018","unstructured":"Jinfeng Li, Xiao Yan, Jian Zhang, An Xu, James Cheng, Jie Liu, Kelvin KW Ng, and Ti-chung Cheng. 2018. A general and efficient querying method for learning to hash. In ACM SIGMOD. 1333-1347.","journal-title":"ACM SIGMOD."},{"key":"e_1_2_1_38_1","first-page":"1475","article-title":"Approximate nearest neighbor search on high dimensional data-experiments, analyses, and improvement","volume":"32","author":"Li Wen","year":"2019","unstructured":"Wen Li, Ying Zhang, Yifang Sun, Wei Wang, Mingjie Li, Wenjie Zhang, and Xuemin Lin. 2019. Approximate nearest neighbor search on high dimensional data-experiments, analyses, and improvement. TKDE, Vol. 32, 8 (2019), 1475-1488.","journal-title":"TKDE"},{"key":"e_1_2_1_39_1","first-page":"246","article-title":"HVS: hierarchical graph structure based on voronoi diagrams for solving approximate nearest neighbor search","volume":"15","author":"Lu Kejing","year":"2021","unstructured":"Kejing Lu, Mineichi Kudo, Chuan Xiao, and Yoshiharu Ishikawa. 2021. HVS: hierarchical graph structure based on voronoi diagrams for solving approximate nearest neighbor search. PVLDB, Vol. 15, 2 (2021), 246-258.","journal-title":"PVLDB"},{"key":"e_1_2_1_40_1","volume-title":"Efficient processing of k nearest neighbor joins using mapreduce. PVLDB","author":"Lu Wei","year":"2012","unstructured":"Wei Lu, Yanyan Shen, Su Chen, and Beng Chin Ooi. 2012. Efficient processing of k nearest neighbor joins using mapreduce. PVLDB (2012)."},{"key":"e_1_2_1_41_1","first-page":"824","volume-title":"IEEE TPAMI","volume":"42","author":"Malkov Yu A","year":"2018","unstructured":"Yu A Malkov and Dmitry A Yashunin. 2018. Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs. IEEE TPAMI, Vol. 42, 4 (2018), 824-836."},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2014.2321376"},{"key":"e_1_2_1_43_1","unstructured":"OpenAI. 2022. ChatGPT. https:\/\/chat.openai.com. Accessed: 2025-03-19."},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/3588908"},{"key":"e_1_2_1_45_1","first-page":"55","article-title":"Approximate shortest distance computing: A query-dependent local landmark scheme","volume":"26","author":"Qiao Miao","year":"2012","unstructured":"Miao Qiao, Hong Cheng, Lijun Chang, and Jeffrey Xu Yu. 2012. Approximate shortest distance computing: A query-dependent local landmark scheme. IEEE TKDE, Vol. 26, 1 (2012), 55-68.","journal-title":"IEEE TKDE"},{"key":"e_1_2_1_46_1","article-title":"Hubs in space: Popular nearest neighbors in high-dimensional data","volume":"11","author":"Radovanovic Milos","year":"2010","unstructured":"Milos Radovanovic, Alexandros Nanopoulos, and Mirjana Ivanovic. 2010. Hubs in space: Popular nearest neighbors in high-dimensional data. Journal of Machine Learning Research, Vol. 11, sept (2010), 2487-2531.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1006\/jvci.1999.0413"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298682"},{"key":"e_1_2_1_49_1","unstructured":"A. Stuart and J.K. Ord. 1999. Kendall's Advanced Theory of Statistics Vol. 2A: Classical Inference & the Linear Model. Oxford University Press. 865 pages."},{"key":"e_1_2_1_50_1","volume-title":"SRS: solving c-approximate nearest neighbor queries in high dimensional euclidean space with a tiny index. PVLDB","author":"Sun Yifang","year":"2014","unstructured":"Yifang Sun, Wei Wang, Jianbin Qin, Ying Zhang, and Xuemin Lin. 2014. SRS: solving c-approximate nearest neighbor queries in high dimensional euclidean space with a tiny index. PVLDB (2014)."},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/1806907.1806912"},{"key":"e_1_2_1_52_1","volume-title":"Analysis of Noncentral Distributions","author":"Torgersen E.","year":"1972","unstructured":"E. Torgersen. 1972. Analysis of Noncentral Distributions. Springer (1972)."},{"key":"e_1_2_1_53_1","first-page":"2614","article-title":"Milvus: A purpose-built vector data management system","author":"Wang Jianguo","year":"2021","unstructured":"Jianguo Wang, Xiaomeng Yi, Rentong Guo, Hai Jin, Peng Xu, Shengjun Li, Xiangyu Wang, Xiangzhou Guo, Chengming Li, Xiaohai Xu, et al., 2021b. Milvus: A purpose-built vector data management system. In ACM SIGMOD. 2614-2627.","journal-title":"ACM SIGMOD."},{"key":"e_1_2_1_54_1","volume-title":"Navigable Proximity Graph-Driven Native Hybrid Queries with Structured and Unstructured Constraints. arXiv preprint arXiv:2203.13601","author":"Wang Mengzhao","year":"2022","unstructured":"Mengzhao Wang, Lingwei Lv, Xiaoliang Xu, Yuxiang Wang, Qiang Yue, and Jiongkang Ni. 2022. Navigable Proximity Graph-Driven Native Hybrid Queries with Structured and Unstructured Constraints. arXiv preprint arXiv:2203.13601 (2022)."},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/3639269"},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1109\/CCGrid51090.2021.00023"},{"key":"e_1_2_1_57_1","unstructured":"Zeyu Wang Haoran Xiong Zhenying He Peng Wang et al. 2024a. Distance Comparison Operators for Approximate Nearest Neighbor Search: Exploration and Benchmark. arXiv preprint arXiv:2403.13491 (2024)."},{"key":"e_1_2_1_58_1","unstructured":"Zeyu Wang Haoran Xiong Zhenying He Peng Wang et al. 2025. Fudist. https:\/\/github.com\/CaucherWang\/Fudist."},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.14778\/3415478.3415541"},{"key":"e_1_2_1_60_1","volume-title":"Amelie Chi Zhou, and Xiaoyong Du","author":"Xu Qian","year":"2025","unstructured":"Qian Xu, Juan Yang, Feng Zhang, Junda Pan, Kang Chen, Youren Shen, Amelie Chi Zhou, and Xiaoyong Du. 2025a. Tribase. https:\/\/github.com\/xuqianmamba\/Tribase."},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1145\/3709743"},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1145\/2093973.2094001"},{"key":"e_1_2_1_63_1","volume-title":"VBASE: Unifying Online Vector Similarity Search and Relational Queries via Relaxed Monotonicity. In OSDI.","author":"Zhang Qianxi","year":"2023","unstructured":"Qianxi Zhang, Shuotao Xu, Qi Chen, Guoxin Sui, Jiadong Xie, Zhizhen Cai, Yaoqi Chen, Yinxuan He, Yuqing Yang, Fan Yang, et al., 2023. VBASE: Unifying Online Vector Similarity Search and Relational Queries via Relaxed Monotonicity. In OSDI."},{"key":"e_1_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.14778\/3594512.3594527"},{"key":"e_1_2_1_65_1","volume-title":"PandaDB: Understanding Unstructured Data in Graph Database. arXiv preprint arXiv:2107.01963","author":"Zhao Zihao","year":"2021","unstructured":"Zihao Zhao, Zhihong Shen, and Mingjie Tang. 2021. PandaDB: Understanding Unstructured Data in Graph Database. arXiv preprint arXiv:2107.01963 (2021)."},{"key":"e_1_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1145\/3639324"}],"container-title":["Proceedings of the ACM on Management of Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3769838","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,7]],"date-time":"2026-04-07T04:31:19Z","timestamp":1775536279000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3769838"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,12,4]]},"references-count":66,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2025,12,4]]}},"alternative-id":["10.1145\/3769838"],"URL":"https:\/\/doi.org\/10.1145\/3769838","relation":{},"ISSN":["2836-6573"],"issn-type":[{"value":"2836-6573","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,12,4]]}}}