{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,17]],"date-time":"2025-09-17T03:18:53Z","timestamp":1758079133026,"version":"3.44.0"},"reference-count":48,"publisher":"Association for Computing Machinery (ACM)","issue":"12","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2025,8]]},"abstract":"<jats:p>Vector indexing enables semantic search over diverse corpora and has become an important interface to databases for both users and AI agents. Efficient vector search requires deep optimizations in database systems. This has motivated a new class of specialized vector databases that optimize for vector search quality and cost. Instead, we argue that a scalable, high-performance, and cost-efficient vector search system can be built inside a cloud-native operational database like Azure Cosmos DB while leveraging the benefits of a distributed database such as high availability, durability, and scale. We do this by deeply integrating DiskANN, a state-of-the-art vector indexing library, inside Azure Cosmos DB NoSQL. This system uses a single vector index per partition stored in existing index trees, and kept in sync with underlying data. It supports &lt; 20ms query latency over an index spanning 10 million vectors, has stable recall over updates, and offers approximately 43\u00d7 and 12\u00d7 lower query cost compared to Pinecone and Zilliz serverless enterprise products. It also scales out to billions of vectors via automatic partitioning. This convergent design presents a point in favor of integrating vector indices into operational databases in the context of recent debates on specialized vector databases, and offers a template for vector indexing in other databases.<\/jats:p>","DOI":"10.14778\/3750601.3750635","type":"journal-article","created":{"date-parts":[[2025,9,16]],"date-time":"2025-09-16T13:38:05Z","timestamp":1758029885000},"page":"5166-5183","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Cost-Effective, Low Latency Vector Search with Azure Cosmos DB"],"prefix":"10.14778","volume":"18","author":[{"given":"Nitish","family":"Upreti","sequence":"first","affiliation":[{"name":"Microsoft"}]},{"given":"Harsha Vardhan","family":"Simhadri","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Hari Sudan","family":"Sundar","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Krishnan","family":"Sundaram","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Samer","family":"Boshra","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Balachandar","family":"Perumalswamy","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Shivam","family":"Atri","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Martin","family":"Chisholm","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Revti Raman","family":"Singh","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Greg","family":"Yang","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Tamara","family":"Hass","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Nitesh","family":"Dudhey","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Subramanyam","family":"Pattipaka","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Mark","family":"Hildebrand","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Magdalen","family":"Manohar","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Jack","family":"Moffitt","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Haiyang","family":"Xu","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Naren","family":"Datha","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Suryansh","family":"Gupta","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Ravishankar","family":"Krishnaswamy","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Prashant","family":"Gupta","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Abhishek","family":"Sahu","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Hemeswari","family":"Varada","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Sudhanshu","family":"Barthwal","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Ritika","family":"Mor","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"James","family":"Codella","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Shaun","family":"Cooper","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Kevin","family":"Pilch","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Simon","family":"Moreno","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Aayush","family":"Kataria","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Santosh","family":"Kulkarni","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Neil","family":"Deshpande","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Amar","family":"Sagare","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Dinesh","family":"Billa","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Zishan","family":"Fu","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Vipul","family":"Vishal","sequence":"additional","affiliation":[{"name":"Microsoft"}]}],"member":"320","published-online":{"date-parts":[[2025,9,16]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"2016. Tokio Runtime. https:\/\/tokio.rs\/."},{"key":"e_1_2_1_2_1","volume-title":"Graph-Based Algorithms for Diverse Similarity Search. CoRR 2502.13336","author":"Anand Piyush","year":"2025","unstructured":"Piyush Anand, Piotr Indyk, Ravishankar Krishnaswamy, Sepideh Mahabadi, Vikas C. Raykar, Kirankumar Shiragur, and Haike Xu. 2025. Graph-Based Algorithms for Diverse Similarity Search. CoRR 2502.13336 (2025). arXiv:2502.13336 http:\/\/arxiv.org\/abs\/2502.13336"},{"key":"e_1_2_1_3_1","volume-title":"Practical and Optimal LSH for Angular Distance. In Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015","author":"Andoni Alexandr","year":"2015","unstructured":"Alexandr Andoni, Piotr Indyk, Thijs Laarhoven, Ilya P. Razenshteyn, and Ludwig Schmidt. 2015. Practical and Optimal LSH for Angular Distance. In Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, December 7\u201312, 2015, Montreal, Quebec, Canada, Corinna Cortes, Neil D. Lawrence, Daniel D. Lee, Masashi Sugiyama, and Roman Garnett (Eds.). 1225\u20131233. https:\/\/proceedings.neurips.cc\/paper\/2015\/hash\/2823f4797102ce1a1aec05359cc16dd9-Abstract.html"},{"key":"e_1_2_1_4_1","unstructured":"Andrew Kane et al. 2024. https:\/\/github.com\/pgvector\/pgvector"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.14778\/3583140.3583166"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.14778\/3685800.3685805"},{"key":"e_1_2_1_7_1","volume-title":"Wortman Vaughan (Eds.)","volume":"34","author":"Chen Qi","year":"2021","unstructured":"Qi Chen, Bing Zhao, Haidong Wang, Mingqin Li, Chuanjie Liu, Zengzhong Li, Mao Yang, and Jingdong Wang. 2021. SPANN: Highly-efficient Billion-scale Approximate Nearest Neighborhood Search. In Advances in Neural Information Processing Systems, M. Ranzato, A. Beygelzimer, Y. Dauphin, P.S. Liang, and J. Wortman Vaughan (Eds.), Vol. 34. Curran Associates, Inc., 5199\u20135212. https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2021\/file\/299dc35e747eb77177d9cea10a802da2-Paper.pdf"},{"key":"e_1_2_1_8_1","unstructured":"Microsoft Corporation. 2024. Hierarchical partition keys in Azure Cosmos DB. https:\/\/learn.microsoft.com\/en-us\/azure\/cosmos-db\/hierarchical-partition-keys?tabs=net-v3%2Cbicep"},{"key":"e_1_2_1_9_1","unstructured":"Microsoft Corporation. 2024. Indexing policies in Azure Cosmos DB. https:\/\/learn.microsoft.com\/en-us\/azure\/cosmos-db\/index-policy"},{"key":"e_1_2_1_10_1","unstructured":"DataStax. 2025. DataStax Vector Search Pricing. https:\/\/www.datastax.com\/pricing\/vector-search."},{"key":"e_1_2_1_11_1","unstructured":"Cosmos DB. 2024. Vector Index Scenario Suite. https:\/\/github.com\/AzureCosmosDB\/VectorIndexScenarioSuite."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2401.08281"},{"key":"e_1_2_1_13_1","unstructured":"Simon H\u00f8rup Eskildsen. 2024. turbopuffer: fast search on object storage. https:\/\/turbopuffer.com\/blog\/turbopuffer."},{"key":"e_1_2_1_14_1","unstructured":"Yury Malkov et al. 2019. Header-only C++\/python library for fast approximate nearest neighbors. https:\/\/github.com\/nmslib\/hnswlib."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.14778\/3303753.3303754"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/3543507.3583552"},{"key":"e_1_2_1_17_1","volume-title":"Proceedings of the 37th International Conference on Machine Learning, ICML 2020, 13\u201318","volume":"119","author":"Guo Ruiqi","year":"2020","unstructured":"Ruiqi Guo, Philip Sun, Erik Lindgren, Quan Geng, David Simcha, Felix Chern, and Sanjiv Kumar. 2020. Accelerating Large-Scale Inference with Anisotropic Vector Quantization. In Proceedings of the 37th International Conference on Machine Learning, ICML 2020, 13\u201318 July 2020, Virtual Event (Proceedings of Machine Learning Research), Vol. 119. PMLR, 3887\u20133896. http:\/\/proceedings.mlr.press\/v119\/guo20h.html"},{"key":"e_1_2_1_18_1","unstructured":"HNSW. 2023. HNSW Github. https:\/\/github.com\/nmslib\/hnswlib\/blob\/master\/examples\/python\/EXAMPLES.md"},{"key":"e_1_2_1_19_1","article-title":"Product Quantization for Nearest Neighbor Search","volume":"33","author":"Jegou Herve","year":"2010","unstructured":"Herve Jegou, Matthijs Douze, and Cordelia Schmid. 2010. Product Quantization for Nearest Neighbor Search. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 33, 1 (2010).","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)"},{"key":"e_1_2_1_20_1","volume-title":"Billion-scale Similarity Search with GPUs. arXiv preprint arXiv:1702.08734","author":"Johnson Jeff","year":"2017","unstructured":"Jeff Johnson, Matthijs Douze, and Herv\u00e9 J\u00e9gou. 2017. Billion-scale Similarity Search with GPUs. arXiv preprint arXiv:1702.08734 (2017)."},{"key":"e_1_2_1_21_1","unstructured":"Jonathan Ellis et al. 2024. JVector. https:\/\/github.com\/jbellis\/jvector"},{"key":"e_1_2_1_22_1","volume-title":"The DiskANN library: Graph-Based Indices for Fast, Fresh and Filtered Vector Search","author":"Krishnaswamy Ravishankar","year":"2025","unstructured":"Ravishankar Krishnaswamy, Magdalen Manohar, and Harsha Vardhan Simhadri. 2025. The DiskANN library: Graph-Based Indices for Fast, Fresh and Filtered Vector Search. IEEE Data Engineering Bulletin 48 (2025). Issue 3."},{"key":"e_1_2_1_23_1","unstructured":"Yury A. Malkov and D. A. Yashunin. 2016. Efficient and Robust Approximate Nearest Neighbor Search using Hierarchical Navigable Small World graphs. CoRR abs\/1603.09320 (2016). arXiv:1603.09320 http:\/\/arxiv.org\/abs\/1603.09320"},{"key":"e_1_2_1_24_1","volume-title":"Harsha Vardhan Simhadri, and Ravishankar Krishnaswamy","author":"Manohar Magdalen","year":"2024","unstructured":"Magdalen Manohar, James Codella, Mark Hildebrand, Harsha Vardhan Simhadri, and Ravishankar Krishnaswamy. 2024. Microsoft DiskANN in Azure Cosmos DB. https:\/\/github.com\/AzureCosmosDB\/DiskANNWhitePapers\/blob\/fc97578ee687189af3a086b35218f368f36b3085\/Microsoft%20DiskANN%20in%20Azure%20Cosmos%20DB.pdf."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/3627535.3638475"},{"key":"e_1_2_1_26_1","unstructured":"Matvey Arye et al. 2025. pgvectorscale. https:\/\/github.com\/timescale\/pgvectorscale"},{"key":"e_1_2_1_27_1","unstructured":"Microsoft. 2025. Azure Compliance Offerings. https:\/\/servicetrust.microsoft.com\/DocumentPage\/7adf2d9e-d7b5-4e71-bad8-713e6a183cf3."},{"key":"e_1_2_1_28_1","unstructured":"Pavan Davuluri. 2024. Windows Copilot Runtime. https:\/\/blogs.windows.com\/windowsdeveloper\/2024\/05\/21\/unlock-a-new-era-of-innovation-with-windows-copilot-runtime-and-copilot-pcs\/"},{"key":"e_1_2_1_29_1","unstructured":"Nick Pentreath Abdulla Abdurakhmanov and Rob Royce. 2017. Vector Scoring Plugin for Elasticsearch. https:\/\/github.com\/MLnick\/elasticsearch-vector-scoring"},{"key":"e_1_2_1_30_1","unstructured":"Pinecone. 2025. PineCone Serverless Pricing Documents. https:\/\/docs.pinecone.io\/guides\/organizations\/manage-cost\/understanding-cost#query"},{"key":"e_1_2_1_31_1","volume-title":"Datasets: Cohere\/wikipedia-22-12-en-embeddings. https:\/\/huggingface.co\/datasets\/Cohere\/wikipedia-22-12-en-embeddings.","author":"Reimers Nils","year":"2022","unstructured":"Nils Reimers. 2022. Datasets: Cohere\/wikipedia-22-12-en-embeddings. https:\/\/huggingface.co\/datasets\/Cohere\/wikipedia-22-12-en-embeddings."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.14778\/2824032.2824065"},{"key":"e_1_2_1_33_1","unstructured":"Harsha Vardhan Simhadri Martin Aum\u00fcller Amir Ingber Matthijs Douze George Williams Magdalen Dobson Manohar Dmitry Baranchuk Edo Liberty Frank Liu Ben Landrum Mazin Karjikar Laxman Dhulipala Meng Chen Yue Chen Rui Ma Kai Zhang Yuzheng Cai Jiayang Shi Yizhuo Chen Weiguo Zheng Zihao Wan Jie Yin and Ben Huang. 2024. Results of the Big ANN: NeurIPS'23 competition. arXiv:2409.17424 [cs.IR] https:\/\/arxiv.org\/abs\/2409.17424"},{"key":"e_1_2_1_34_1","unstructured":"Harsha Vardhan Simhadri Martin Aum\u00fcller Amir Ingber Matthijs Douze George Williams Magdalen Dobson Manohar Dmitry Baranchuk Edo Liberty Frank Liu Ben Landrum Mazin Karjikar Laxman Dhulipala Meng Chen Yue Chen Rui Ma Kai Zhang Yuzheng Cai Jiayang Shi Yizhuo Chen Weiguo Zheng Zihao Wan Jie Yin and Ben Huang. 2024. Results of the Big ANN: NeurIPS'23 competition. arXiv:2409.17424 [cs.IR] https:\/\/arxiv.org\/abs\/2409.17424"},{"key":"e_1_2_1_35_1","unstructured":"Harsha Vardhan Simhadri Ravishankar Krishnaswamy Gopal Srinivasa Suhas Jayaram Subramanya Andrija Antonijevic Dax Pryce David Kaczynski Shane Williams Siddarth Gollapudi Varun Sivashankar Neel Karia Aditi Singh Shikhar Jaiswal Neelam Mahapatro Philip Adams Bryan Tower and Yash Patel. 2023. DiskANN: Graph-structured Indices for Scalable Fast Fresh and Filtered Approximate Nearest Neighbor Search. https:\/\/github.com\/Microsoft\/DiskANN"},{"key":"e_1_2_1_36_1","volume-title":"Ravishankar Krishnaswamy, and Harsha Vardhan Simhadri.","author":"Singh Aditi","year":"2021","unstructured":"Aditi Singh, Suhas Jayaram Subramanya, Ravishankar Krishnaswamy, and Harsha Vardhan Simhadri. 2021. FreshDiskANN: A Fast and Accurate Graph-Based ANN Index for Streaming Similarity Search. CoRR abs\/2105.09613 (2021). arXiv:2105.09613 https:\/\/arxiv.org\/abs\/2105.09613"},{"key":"e_1_2_1_37_1","unstructured":"Rodrigo Souza. 2024. CosmosDB Dev Blog. https:\/\/devblogs.microsoft.com\/cosmosdb\/announcing-cost-and-performance-improvements-with-azure-cosmos-dbs-binary-encoding\/."},{"key":"e_1_2_1_38_1","volume-title":"Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019","author":"Subramanya Suhas Jayaram","year":"2019","unstructured":"Suhas Jayaram Subramanya, Devvrit, Rohan Kadekodi, Ravishankar Krishnaswamy, and Harsha Vardhan Simhadri. 2019. DiskANN: Fast Accurate Billionpoint Nearest Neighbor Search on a Single Node. In Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8\u201314, 2019, Vancouver, BC, Canada, Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d'Alch\u00e9-Buc, Emily B. Fox, and Roman Garnett (Eds.). 13748\u201313758. https:\/\/proceedings.neurips.cc\/paper\/2019\/hash\/09853c7fb1d3f8ee67a61b6bf4a7f8e6-Abstract.html"},{"key":"e_1_2_1_39_1","unstructured":"Julie Tibshirani. 2019. Text similarity search with vector fields. https:\/\/www.elastic.co\/blog\/text-similarity-search-with-vectors-in-elasticsearch"},{"key":"e_1_2_1_40_1","unstructured":"Benjamin Trent. 2023. Make HNSW merges faster. https:\/\/github.com\/apache\/lucene\/issues\/12440"},{"key":"e_1_2_1_41_1","unstructured":"Benjamin Trent. 2024. Make HNSW merges cheaper on heap. https:\/\/github.com\/apache\/lucene\/issues\/14208"},{"key":"e_1_2_1_42_1","unstructured":"Turbopuffer. 2025. TurboPuffer Pricing. https:\/\/turbopuffer.com\/pricing."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.14778\/3415478.3415541"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2502.13826"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/3600006.3613166"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/3331184.3331198"},{"key":"e_1_2_1_47_1","volume-title":"17th USENIX Symposium on Operating Systems Design and Implementation (OSDI 23)","author":"Zhang Qianxi","year":"2023","unstructured":"Qianxi Zhang, Shuotao Xu, Qi Chen, Guoxin Sui, Jiadong Xie, Zhizhen Cai, Yaoqi Chen, Yinxuan He, Yuqing Yang, Fan Yang, Mao Yang, and Lidong Zhou. 2023. VBASE: Unifying Online Vector Similarity Search and Relational Queries via Relaxed Monotonicity. In 17th USENIX Symposium on Operating Systems Design and Implementation (OSDI 23). USENIX Association, Boston, MA, 377\u2013395. https:\/\/www.usenix.org\/conference\/osdi23\/presentation\/zhang-qianxi"},{"key":"e_1_2_1_48_1","unstructured":"Zilliz. 2025. Zilliz Serverless Pricing Documents. https:\/\/docs.zilliz.com\/docs\/understand-cost#example"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3750601.3750635","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,16]],"date-time":"2025-09-16T13:38:42Z","timestamp":1758029922000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3750601.3750635"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,8]]},"references-count":48,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2025,8]]}},"alternative-id":["10.14778\/3750601.3750635"],"URL":"https:\/\/doi.org\/10.14778\/3750601.3750635","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2025,8]]},"assertion":[{"value":"2025-09-16","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}